Closed by mjpost 5 years ago
I feared this day would come.
I imagine it would be simple to generate these and add a button. I wonder if it will add a lot to the build time.
What we could also consider is to have this part done dynamically. We could run a web server with CGI on aclanthology.info and have it build the EndNote file for any URL that is accessed.
What do you think @mbollmann and @villalbamartin?
Rather than needing to “go the full EndNote” (and thus commence the game of non-BibTeX format whack-a-mole), one option would be to just add RIS output. EndNote can read that, I believe, as can basically everything else that isn’t BibTeX. I seem to recall that the format is a bit simpler than EndNote’s, though I could be misremembering.
Oh, that's great to know. Bibutils supports RIS, too (xml2ris).
@GabrielLin can you confirm that EndNote can read RIS format, and that that would therefore be sufficient?
If RIS is enough, then I'm all for it. Failing that, the same package that provides the RIS converter (xml2ris) also provides an EndNote converter (xml2end), so we could in theory support EndNote as a lesser format. But the fewer formats we need to support, the better.
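For illustration, a minimal sketch of how static generation with these two bibutils converters could look. It assumes that each converter reads one MODS XML file and writes the result to stdout; the function names, the output layout, the `.endf` extension, and the example ID `P19-1001` are hypothetical and not taken from the Anthology code.

```python
import subprocess
from pathlib import Path

# bibutils converters named in this thread; the dictionary keys double as
# file extensions (the ".endf" extension is an assumption for this sketch).
CONVERTERS = {"ris": "xml2ris", "endf": "xml2end"}


def build_command(fmt, mods_file):
    """Return the bibutils command line for one MODS XML file."""
    return [CONVERTERS[fmt], str(mods_file)]


def output_path(fmt, mods_file, out_dir):
    """Hypothetical layout: <out_dir>/<input stem>.<fmt>."""
    return Path(out_dir) / f"{Path(mods_file).stem}.{fmt}"


def convert(fmt, mods_file, out_dir, run=subprocess.run):
    """Convert one MODS XML file and write the converter's stdout to disk.

    `run` is injectable so the logic can be exercised without bibutils installed.
    """
    target = output_path(fmt, mods_file, out_dir)
    result = run(build_command(fmt, mods_file),
                 capture_output=True, text=True, check=True)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(result.stdout, encoding="utf-8")
    return target
```

Calling `convert("ris", "P19-1001.xml", "build/ris")` would then invoke `xml2ris P19-1001.xml` and store the output under `build/ris/`, provided bibutils is on the `PATH`.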
If we're going to generate them dynamically, I think we should make both available. If statically as part of Hugo, then we should just do RIS.
@mjpost, EndNote can import the RIS format, but it needs some configuration, and in some cases the import can fail, just as it can when importing .bib files. Providing the EndNote format would ensure that the import succeeds. I therefore strongly suggest adding the EndNote format, as in the old version of the ACL Anthology. Thanks.
@mbollmann, do you have any thoughts on this? Should we add 100k files to our export (EndNote + RIS)? Or take the dynamic approach?
I do worry about the technical debt of generating these dynamically offsite. On the other hand this might be more important than I give credit to as a non-user of either of these formats, and the implementation shouldn't be that hard.
@villalbamartin are you looking into this or are we tabling it?
I agree with @GabrielLin. We want to encourage other fields to cite our work so that we can position all of our authors as prominent scholars even outside of CL and NLP.
Having EndNote and/or RIS is useful for these folks, even if the files for the build might not be built on every run. Separately, you may want to start adding version numbers to bib files so that they can be tracked if they get built and differ from the current ones. I'll post a new issue for that if you'd like.
A dynamic approach would essentially mean writing a wrapper around bibutils. It sounds pretty simple, but it does add another layer of complexity and maintenance cost. I'm not sure if there's any significant cost to generating 100k+ more files, but it's relatively trivial to implement, so there's little cost to trying (and potentially reverting) it.
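To make the "wrapper around bibutils" idea concrete, here is a rough sketch of the request-handling part only. The URL scheme (`/<paper id>.<extension>`), the extensions, and the function names are assumptions for illustration; the actual conversion is passed in as a caller-supplied function, which in practice might shell out to `xml2ris` or `xml2end`.

```python
import posixpath

# Hypothetical mapping from requested file extension to bibutils converter.
CONVERTERS = {".ris": "xml2ris", ".endf": "xml2end"}


def resolve_request(url_path):
    """Map a path such as '/P19-1001.ris' to (paper_id, converter).

    Returns None when the path has no paper ID or an unsupported extension.
    """
    name = posixpath.basename(url_path)
    paper_id, ext = posixpath.splitext(name)
    converter = CONVERTERS.get(ext.lower())
    if not paper_id or converter is None:
        return None
    return paper_id, converter


def handle(url_path, convert):
    """Return (status, body) for a request.

    `convert(converter, paper_id)` is supplied by the caller and returns the
    converted citation text.
    """
    resolved = resolve_request(url_path)
    if resolved is None:
        return 404, "unsupported format"
    paper_id, converter = resolved
    return 200, convert(converter, paper_id)
```

Even this small sketch hints at the maintenance cost mentioned above: path validation, error handling, and caching would all need to live somewhere outside the static build.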
Let's try that [edit: meaning static generation] first. What if we just did EndNote to start with? Who/what uses RIS?
RefWorks, Mendeley, etc., though as far as I know both of those can also read and write BibTeX. I guess you could say that RIS is more "cross-platform" than EndNote, but I don't know if the marginal utility of including RIS is worth the extra files, given that the technical difference between creating RIS vs. EndNote is nonexistent. I hadn't realized before that the same set of command-line tools produces both, and my earlier advocacy for RIS was based on the thought that it's a simpler format to generate, but given that we're using existing tools, it's a moot point.
It looks to me as if bibutils should work out of the box, at least in theory. If it's okay with everyone I will take care of this.
That'd be great! As static files, right?
Yes, I'm trying to do it with the very minimum, using the files generated by the bib2xml_wrapper script.
I am currently having problems compiling the full version on my computer, though, so I'm opening a ticket about that.
Small update on this question:
"I wonder if it will add a lot to the build time."
On my underpowered laptop, the step "Converting BibTeX files to MODS XML" takes 2:19 min. Generating the EndNote files in the exact same fashion takes 6:03 min. Converting BibTeX to MODS runs at 357 files/sec, while MODS to EndNote runs at 136 files/sec.
I am testing a version of the code right now that implements the GUI aspects. I expect to make a pull request by tomorrow.
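As a rough consistency check on the timings reported above, the short calculation below multiplies each duration by its throughput. It uses only the four numbers from the comment; the helper names are made up for this sketch, and the throughput figures are presumably rounded.

```python
def seconds(mm_ss):
    """Convert a 'MM:SS' string to a number of seconds."""
    minutes, secs = mm_ss.split(":")
    return int(minutes) * 60 + int(secs)


def implied_files(duration, files_per_sec):
    """Number of files implied by a duration and a throughput."""
    return seconds(duration) * files_per_sec


bib_to_mods = implied_files("02:19", 357)      # BibTeX -> MODS XML
mods_to_endnote = implied_files("06:03", 136)  # MODS XML -> EndNote
slowdown = seconds("06:03") / seconds("02:19")

print(bib_to_mods, mods_to_endnote, round(slowdown, 1))
```

Both steps imply a little under 50,000 input files, so the figures agree with each other, and the EndNote step takes roughly 2.6 times as long as the BibTeX-to-MODS step on that machine.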
It seems that the EndNote button has disappeared?
This was a build error and has been fixed.
Thank you @mjpost
I urge you to bring back the EndNote import file. Thanks.
Originally posted by @GabrielLin in https://github.com/acl-org/acl-anthology/issues/170#issuecomment-479816995