retorquere / zotero-better-bibtex

Make Zotero effective for us LaTeX holdouts
https://retorque.re/zotero-better-bibtex/
MIT License
5.43k stars 291 forks

inspire-hep citekey #1110

Closed BenjaminDbb closed 5 years ago

BenjaminDbb commented 5 years ago

Thank you for the most brilliant add-on for Zotero!

TL;DR: please add an option to generate inspire-hep style citekeys (texkey) and BibTeX.

I enjoy using BBT in my own work, but I recently ran into a problem when I started writing a paper with collaborators who use different citation keys.

Most physicists are used to citing papers in the inspire-hep form (both citekey and BibTeX). inspire-hep and ADS are the most popular databases in physics, and an inspire-hep citation key never changes, even if the authors or title of a paper are modified. Many people prefer to use inspire-hep keys, but I found that BBT has no option to generate them.

  1. The form of the inspire-hep texkey (citation key): it is hard to produce the 3 random letters with the configurable citekey generator, so I wonder if you could add this function to BBT.

  2. I found a Python package, pyinspire, at http://inspirehep.net/info/hep/tools/index?ln=en. I tested it, and it works well for now.

    arXiv number test: arXiv: 1809.07673

    python pyinspire.py -s 1809.07673 -b

    @article{Bar-Or:2018pxz,
          author         = "Bar-Or, Ben and Fouvry, Jean-Baptiste and Tremaine, Scott",
          title          = "{Relaxation in a Fuzzy Dark Matter Halo}",
          year           = "2018",
          eprint         = "1809.07673",
          archivePrefix  = "arXiv",
          primaryClass   = "astro-ph.GA",
          SLACcitation   = "%%CITATION = ARXIV:1809.07673;%%"
    }

    DOI number test: DOI: 10.1073/pnas.1308716112

    python pyinspire.py -s 10.1073/pnas.1308716112 -b

    @article{Weinberg:2013aya,
          author         = "Weinberg, David H. and Bullock, James S. and Governato, Fabio and Kuzio de Naray, Rachel and Peter, Annika H. G.",
          title          = "{Cold dark matter: controversies on small scales}",
          booktitle      = "{Sackler Colloquium: Dark Matter Universe: On the Threshhold of Discovery Irvine, USA, October 18-20, 2012}",
          journal        = "Proc. Nat. Acad. Sci.",
          volume         = "112",
          year           = "2015",
          pages          = "12249-12255",
          doi            = "10.1073/pnas.1308716112",
          eprint         = "1306.0913",
          archivePrefix  = "arXiv",
          primaryClass   = "astro-ph.CO",
          SLACcitation   = "%%CITATION = ARXIV:1306.0913;%%"
    }

    DOI: 10.1103/PhysRevLett.85.1158

    python pyinspire.py -s 10.1103/PhysRevLett.85.1158 -b

    @article{Hu:2000ke,
          author         = "Hu, Wayne and Barkana, Rennan and Gruzinov, Andrei",
          title          = "{Cold and fuzzy dark matter}",
          journal        = "Phys. Rev. Lett.",
          volume         = "85",
          year           = "2000",
          pages          = "1158-1161",
          doi            = "10.1103/PhysRevLett.85.1158",
          eprint         = "astro-ph/0003365",
          archivePrefix  = "arXiv",
          primaryClass   = "astro-ph",
          SLACcitation   = "%%CITATION = ASTRO-PH/0003365;%%"
    }

Could you add this key generation and BibTeX form to BBT?

Thanks

blip-bloop commented 5 years ago

:robot: this is your friendly neighborhood build bot announcing test build 5.1.33.3250 ("add key to pattern formatter")

Install in Zotero by downloading test build 5.1.33.3250, opening the Zotero "Tools" menu, selecting "Add-ons", open the gear menu in the top right, and select "Install Add-on From File...".

retorquere commented 5 years ago

With 3250, [auth]:[year][key:lower:substring=1,3] should do AFAICT. key is the internal zotero key of the item; it is a random, 6-character alphanumeric string which won't change regardless of what you change on the item (but it can also not be set manually, so when you merge two items, you end up with the key of the merge parent you chose). It is uppercase, so you'd want it lowercase, and you only want the first 3, hence the call to substring.
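As a rough illustration of what that pattern produces, here is a minimal Python sketch. It is not BBT's implementation; the function name `inspire_like_key` and the sample item key `K3XQ7A` are made up for the example, and it assumes `substring=1,3` means "the first three characters", as described above.

```python
def inspire_like_key(auth: str, year: int, item_key: str) -> str:
    """Mimic [auth]:[year][key:lower:substring=1,3] for a single item."""
    # [key:lower] lowercases the Zotero item key; substring=1,3 then keeps
    # the first three characters of the lowercased key.
    suffix = item_key.lower()[:3]
    return f"{auth}:{year}{suffix}"


# Hypothetical item: author "Hu", year 2000, Zotero item key "K3XQ7A".
print(inspire_like_key("Hu", 2000, "K3XQ7A"))  # Hu:2000k3x
```

The result has the same shape as an inspire-hep key, but the three trailing characters come from the Zotero item key, so they will generally differ from the ones inspire-hep assigns.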

With question 2, I'm not sure what you want changed in BBT. What do you mean by "form"?

BTW, the title and booktitle in the pyinspire.py results you posted above are almost certainly wrong. The "{...}" format is going to give the wrong results with citation styles that demand Title Case.

BenjaminDbb commented 5 years ago

I tried test build 5.1.33.3250 with [auth]:[year][key:lower:substring=1,3]. I tested it on Pürrer's paper "Frequency domain reduced order model of aligned-spin effective-one-body waveforms with generic mass-ratios and spins", whose DOI is 10.1103/PhysRevD.93.064041.

With BBT the key is Purrer:20162zm, but what I really want is Purrer:2015tud. Inspire-hep acts like an administrator: it creates a unique ID (citation key and BibTeX) for every paper, which everyone then uses by common agreement. (screenshots)

BibTeX of this paper: (screenshot)

retorquere commented 5 years ago

You're getting Purrer:20162zm, because the date in the reference is 2016, not 2015. I don't know by what means inspire generates the tud, but if it's actually random, I can of course not replicate that.

I will take a look to see if the python script has some clues on how to get the inspire key, but what would I use as the search key? In Zotero, not all items have a DOI field. For the short term while I figure this out, you might be best off with just pinning the key manually by adding bibtex: Purrer:2015tud to the extra field.
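For anyone who wants to script that manual pinning outside Zotero, here is a minimal Python sketch of the text manipulation involved. It only edits a string; it does not talk to Zotero, and the function name `pin_citekey` is made up for the example. The `bibtex: <key>` line format is the one mentioned above.

```python
def pin_citekey(extra: str, citekey: str) -> str:
    """Return the extra-field text with exactly one 'bibtex: <citekey>' line."""
    # Drop any existing pin so the item never carries two of them.
    kept = [
        line
        for line in extra.splitlines()
        if not line.strip().lower().startswith("bibtex:")
    ]
    kept.append(f"bibtex: {citekey}")
    return "\n".join(kept)


print(pin_citekey("", "Purrer:2015tud"))  # bibtex: Purrer:2015tud
```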

As to the 2nd part, I still don't understand what you want BBT to do. I already generate BibTeX that looks mostly like what's above, except I don't put the "{...}" around titles to get something like "{Frequency domain reduced order model of aligned-spin effective-one-body waveforms with generic mass ratios and spins}", because it's just wrong. It should be {Frequency Domain Reduced Order Model of Aligned-spin Effective-one-body Waveforms with Generic Mass Ratios and Spins}, because BibTeX expects titles to be in Title Case and will downcase them when required by the style.

To add the arXiv-related fields, you can add arXiv:1512.02248 [gr-qc] to the extra field and BBT will export that to

  archivePrefix = {arXiv},
  eprint = {1512.02248},
  eprinttype = {arxiv},
  primaryClass = {gr-qc},
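The mapping from that extra-field line to the exported fields can be sketched in Python as follows. This is an illustration only, not BBT's parser; `ARXIV_LINE` and `arxiv_fields` are hypothetical names, and the regular expression assumes the `arXiv:<id> [<class>]` layout shown above.

```python
import re
from typing import Dict, Optional

# Assumed layout: "arXiv:<identifier>" optionally followed by "[<primary class>]".
ARXIV_LINE = re.compile(
    r"^\s*arXiv:\s*(?P<id>\S+)(?:\s+\[(?P<cls>[^\]]+)\])?\s*$",
    re.IGNORECASE,
)


def arxiv_fields(extra: str) -> Optional[Dict[str, str]]:
    """Return BibTeX fields for the first arXiv line in `extra`, or None."""
    for line in extra.splitlines():
        match = ARXIV_LINE.match(line)
        if match:
            fields = {
                "archivePrefix": "arXiv",
                "eprint": match.group("id"),
                "eprinttype": "arxiv",
            }
            if match.group("cls"):
                fields["primaryClass"] = match.group("cls")
            return fields
    return None


print(arxiv_fields("arXiv:1512.02248 [gr-qc]"))
```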

I can add the SLACcitation if that's a standardized format but I haven't been able to find a specification for it so far, any pointers?

retorquere commented 5 years ago

There is a possibility to get the inspire-hep key but by $DEITY inspire-hep is slow; it will take about 0.8-1.1sec per key. I'm looking into it.

blip-bloop commented 5 years ago

:robot: this is your friendly neighborhood build bot announcing test build 5.1.33.3252 ("fetch inspireHEP key")

Install in Zotero by downloading test build 5.1.33.3252, opening the Zotero "Tools" menu, selecting "Add-ons", open the gear menu in the top right, and select "Install Add-on From File...".

retorquere commented 5 years ago

Try 3252; there's an extra option in the context menu when you right-click references which says Pin BibTeX key from InspireHEP. When you select that, BBT will look up the key for all selected entries and pin them by adding them to the extra field. If there's a DOI filled out, it will use that; alternately, if there's a DOI: ... or an arXiv: line in the extra field, it will use the first one it finds.

Using this key, it will search inspireHEP to see if it can find an exact match, and will use that.

If it cannot find a DOI or arXiv key in the DOI or the extra field, the item will be skipped over.

If it can find a search key, but inspireHEP doesn't return results or it returns an error, it will be skipped over.

If it can find a search key and inspireHEP returns 0 or more than 1 results, it will be skipped over.

If it can find a search key and inspireHEP returns exactly one result but that doesn't have a TeX citation key, it will be skipped over.

Most skips will leave a message in the error log.
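The selection and skip rules above can be sketched in Python like this. It is a simplified illustration, not the code BBT runs: no request to inspireHEP is made, the search results are represented as plain dictionaries, and the names `find_search_key`, `pick_texkey` and the `"texkey"` dictionary key are assumptions made for the example.

```python
import re
from typing import Dict, List, Optional


def find_search_key(doi_field: str, extra: str) -> Optional[str]:
    """Pick the identifier to search with; None means 'skip this item'."""
    # A filled-out DOI field wins.
    if doi_field.strip():
        return doi_field.strip()
    # Otherwise use the first "DOI: ..." or "arXiv: ..." line in the extra field.
    for line in extra.splitlines():
        match = re.match(r"\s*(?:doi|arxiv)\s*:\s*(\S+)", line, re.IGNORECASE)
        if match:
            return match.group(1)
    return None


def pick_texkey(results: List[Dict[str, str]]) -> Optional[str]:
    """Accept only an unambiguous result that carries a citation key."""
    if len(results) != 1:
        return None  # zero or several matches: skip
    return results[0].get("texkey") or None  # missing or empty key: skip


print(find_search_key("", "arXiv: 1512.02248 [gr-qc]"))  # 1512.02248
print(pick_texkey([{"texkey": "Purrer:2015tud"}]))       # Purrer:2015tud
```

Returning `None` at every uncertain step mirrors the "skipped over" behaviour: an item is only pinned when there is exactly one match with a key.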

retorquere commented 5 years ago

Note that if you select a large number of items, it will be pretty slow. That's out of my control; the inspire API is just slow.

blip-bloop commented 5 years ago

:robot: this is your friendly neighborhood build bot announcing test build 5.1.33.3253 ("pick up DOI from extra for inspire-hep")

Install in Zotero by downloading test build 5.1.33.3253, opening the Zotero "Tools" menu, selecting "Add-ons", open the gear menu in the top right, and select "Install Add-on From File...".

blip-bloop commented 5 years ago

:robot: this is your friendly neighborhood build bot announcing test build 5.1.33.3254 ("cleanup")

Install in Zotero by downloading test build 5.1.33.3254, opening the Zotero "Tools" menu, selecting "Add-ons", open the gear menu in the top right, and select "Install Add-on From File...".

blip-bloop commented 5 years ago

:robot: this is your friendly neighborhood build bot announcing test build 5.1.33.3262 ("slacccitation")

Install in Zotero by downloading test build 5.1.33.3262, opening the Zotero "Tools" menu, selecting "Add-ons", open the gear menu in the top right, and select "Install Add-on From File...".

retorquere commented 5 years ago

3262 adds a new hidden preference called SLACcitation which, when set to true, will export the SLACcitation field if there's arXiv info in the item.

I'm not wholly sold on this SLACcitation yet though. I can find very little info on it, it seems REVTeX specific, and it seems to duplicate info that's already in other fields. If you can find more info on what it's supposed to do that would be most welcome.

BenjaminDbb commented 5 years ago

I tested builds 5.1.33.3252, 5.1.33.3254 and 5.1.33.3262 with Pin BibTeX key from InspireHEP, and only 5.1.33.3252 works.

It looks like a bug in versions 5.1.33.3254 and 5.1.33.3262. Also, could BBT export BibTeX in the InspireHEP style, like the BibTeX of this paper?

When I export the BibTeX with BBT (citation 23): (screenshot)

When I use the InspireHEP BibTeX (citation 23): (screenshot)

What about storing all the BibTeX info in Zotero when fetching the citation key from InspireHEP, and copying it to the *.bib file when exporting items from the library?

BTW, I think it helps to explain the motivation for the InspireHEP BibTeX key. For example, when you finish or almost finish a paper in 2015, you might upload a preprint to arXiv. The key generated by BBT is then someone1 title1 2015, but when the revised paper is published in 2016 it changes to someone2 title2 2016. This confuses a lot of people when the paper is cited. The InspireHEP BibTeX key, by contrast, never changes once generated, so using it has become an unwritten agreement in physics.

blip-bloop commented 5 years ago

:robot: this is your friendly neighborhood build bot announcing test build 5.1.33.3263 ("extraFields, not extraKeys")

Install in Zotero by downloading test build 5.1.33.3263, opening the Zotero "Tools" menu, selecting "Add-ons", open the gear menu in the top right, and select "Install Add-on From File...".

retorquere commented 5 years ago

3263 fixes the pin from inspireHEP. I have no problem with that feature, I'm just not wholly convinced about exporting the SLACcitation field, which looks to be a REVTeX-specific field for a specific journal... that's perhaps best done with a postscript.

For the 2nd part, can you:

  1. Right-click that Purrer reference and send a BBT error report (and post the debug ID here)
  2. Create an MWE (example below) that reproduces your desired output, preferably in a shared Overleaf document (you can share publicly for this)

\documentclass{article}

\usepackage{filecontents}

\begin{filecontents}{\jobname.bib}
@article{Purrer:2015tud,
      author         = "Pürrer, Michael",
      title          = "{Frequency domain reduced order model of aligned-spin
                        effective-one-body waveforms with generic mass-ratios and
                        spins}",
      journal        = "Phys. Rev.",
      volume         = "D93",
      year           = "2016",
      number         = "6",
      pages          = "064041",
      doi            = "10.1103/PhysRevD.93.064041",
      eprint         = "1512.02248",
      archivePrefix  = "arXiv",
      primaryClass   = "gr-qc",
      SLACcitation   = "%%CITATION = ARXIV:1512.02248;%%"
}
\end{filecontents}

\begin{document}

From \cite{Purrer:2015tud} we see \ldots

\bibliographystyle{alpha}
\bibliography{\jobname}
\end{document}

retorquere commented 5 years ago

Also, I still haven't found any info on the SLACcitation field. I see some samples, but some add the primaryClass to the SLACcitation, some don't. Based on the samples I can't really do much since they're ambiguous.

retorquere commented 5 years ago

And sometimes they look like SLACcitation = "%%CITATION = HEP-TH/9711200;%%", another time they look like SLACcitation = "%%CITATION = ARXIV:1512.02248;%%" (note the ARXIV: part). The samples I happen to find don't really illustrate what they should look like, and I really can't find a spec for this field. It certainly isn't in BibTeXing or Tame the BeaST (which is as close to a spec for BibTeX as you'll find).

The HyperTeX FAQ links to what they call the BibTeX page on arXiv.org, which doesn't mention SLACcitation at all. In my experience, arXiv doesn't have good documentation for the BibTeX stuff.

BenjaminDbb commented 5 years ago

Thank you. Could you save all the BibTeX info (without parsing it) from InspireHEP in Zotero when fetching the citation key, and then copy it to the *.bib file when exporting items from the library?

retorquere commented 5 years ago

I'm getting the idea that SLACcitation isn't standardized at all, it looks to be a free-form field in REVTeX that just gets included in the output whatever you put in there.

retorquere commented 5 years ago

Thank you. Could you save all the BibTeX info (without parsing it) from InspireHEP in Zotero when fetching the citation key, and then copy it to the *.bib file when exporting items from the library?

No? Where would I put it?

BenjaminDbb commented 5 years ago

Save the info in the BBT cache file?

Indeed, SLACcitation is not standardized. But the difference between SLACcitation = "%%CITATION = HEP-TH/9711200;%%" and SLACcitation = "%%CITATION = ARXIV:1512.02248;%%" results from the old-style and new-style preprint numbers in arXiv. Could you create a hidden SLACcitation field and just save it without parsing?

Or, I think generating the citekey is enough for me. I can try to write a Python script that transforms the *.bib file exported from BBT into the InspireHEP form.

Thanks again

retorquere commented 5 years ago

The cache is to store what BBT generates from the Zotero item; it gets cleared and recreated on various occasions (such as pref changes, item changes and BBT upgrades), so that would mean I'd have to mark these items as special and re-fetch from inspire when that happens. I re-fill the cache during export, and during export, Zotero doesn't allow me to do web requests, so that would be a major architectural change to BBT to fix. Also, if you have an item in Zotero, grab the inspire-hep info, and then start editing the item, the cache would no longer reflect what you can see in Zotero, and the user cannot see this discrepancy.

This is all too much departure from what BBT aims to do, which revolves around items in the Zotero database; getting BibTeX into Zotero items as best it can, and from Zotero items generate the best BibTeX possible. If you really want unchanged BibTeX, you're better off with something like JabRef. You're looking to cut out Zotero entirely. If you want to have a way to best re-create inspire-HEP BibTeX from Zotero items, I can help, but I don't aim to supplant JabRef as a plain-bibtex manager.

If you fill out the MWE, I can try to set up things so that BBT exports stuff that gets you the desired output; if you import that inspire-HEP bibtex, I can make it so that enough is retained so that export gets you a semantic equivalent (although REVTeX's puzzling decision to use % signs there doesn't make that particularly easy).

retorquere commented 5 years ago

I don't think it would be necessary to run a script to conform to what inspire-hep exports. BBT can export an equivalent. "..." and {...} around fields mean exactly the same.

I can detect old-style arXiv and export it with an ARXIV: prefix. That should be doable.

BenjaminDbb commented 5 years ago

Here is the Overleaf URL.

Thanks for your patience. I chose 4 BibTeX entries and compiled them with the REVTeX 4-1 class (used by the Physical Review series).

retorquere commented 5 years ago

Super. Can you make the MWE editable?

In Purrer:2015tud, why does the SLACcitation have the old-style arXiv ID when there's primaryClass info?

BenjaminDbb commented 5 years ago

Editable version: https://www.overleaf.com/4345765536fmvztnsjxqtx. Actually, in Purrer:2015tud the SLACcitation uses the new-style arXiv ID, and in Visser:1997hd the SLACcitation is old-style, with GR-QC/9705051.

retorquere commented 5 years ago

Right; the question then becomes: why does Visser:1997hd have an old-style arXiv ID in the SLACcitation?

BenjaminDbb commented 5 years ago

arXiv changed its citation identifiers in April 2007.

BTW, when you search for 9705051 on arXiv, it returns (screenshot); you must use gr-qc/9705051 as the paper's ID.

After 2007, however, 1512.02248 on its own is a unique ID for Purrer:2015tud.
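The two identifier styles, and the two SLACcitation shapes seen in the samples in this thread, can be told apart with a short Python sketch. No specification for SLACcitation has turned up in this discussion, so the output format below is inferred purely from the samples quoted here; the function names and regular expressions are assumptions made for illustration.

```python
import re

# New-style IDs (April 2007 onwards): YYMM.NNNN or YYMM.NNNNN, optional version.
NEW_STYLE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")
# Old-style IDs: archive(.SUBJECT)/YYMMNNN, e.g. gr-qc/9705051 or hep-th/9711200.
OLD_STYLE = re.compile(r"^[a-z-]+(\.[a-z]{2})?/\d{7}(v\d+)?$", re.IGNORECASE)


def arxiv_style(arxiv_id: str) -> str:
    """Classify an arXiv identifier as 'new', 'old' or 'unknown'."""
    if NEW_STYLE.match(arxiv_id):
        return "new"
    if OLD_STYLE.match(arxiv_id):
        return "old"
    return "unknown"


def slac_citation(arxiv_id: str) -> str:
    """Build a SLACcitation value in the shape of the samples above."""
    style = arxiv_style(arxiv_id)
    if style == "new":
        return f"%%CITATION = ARXIV:{arxiv_id};%%"
    if style == "old":
        return f"%%CITATION = {arxiv_id.upper()};%%"
    raise ValueError(f"not a recognised arXiv identifier: {arxiv_id!r}")


print(slac_citation("1512.02248"))     # %%CITATION = ARXIV:1512.02248;%%
print(slac_citation("gr-qc/9705051"))  # %%CITATION = GR-QC/9705051;%%
```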

retorquere commented 5 years ago

That still only partially answers the question; the link you point to says:

Bibtex styles can be easily converted to support the eprint field for referring to eprints. You can add eprint entries to your bibtex database like this,

@Article{Beneke:1997hv,
     author    = "M. Beneke and G. Buchalla and I. Dunietz",
     title     = "{Mixing induced CP asymmetries in inclusive B decays}",
     journal   = "Phys. Lett.",
     volume    = "B393",
     year      = "1997",
     pages     = "132-142",
     eprint        = "hep-ph/9609357"
}

The change is backwards compatible. The eprint field is just ignored when you use a style which doesn't support eprints, and the references are formatted as normal.

For the new style arXiv identifiers (April 2007 and onwards) we recommend these bib-style extensions:

     archivePrefix = "arXiv",
     eprint        = "0707.3168",
     primaryClass  = "hep-th",

where Visser:1997hd has the new-style fields, but an old-style SLACcitation.

BenjaminDbb commented 5 years ago

(screenshot: arxiv_identifier update) Visser:1997hd has an old-style arXiv identifier, hence the old-style SLACcitation.

Both old-style and new-style fields can be used for Visser:1997hd. Most people use the new-style fields now.

retorquere commented 5 years ago

Ah, right.

And what about the SLACcitation in Weinberg:1972kfs?

blip-bloop commented 5 years ago

:robot: this is your friendly neighborhood build bot announcing test build 5.1.33.3283 ("SLACcitation")

Install in Zotero by downloading test build 5.1.33.3283, opening the Zotero "Tools" menu, selecting "Add-ons", open the gear menu in the top right, and select "Install Add-on From File...".

retorquere commented 5 years ago

In 3283, if you copy the inspireHEP BibTeX to the clipboard, an Import from clipboard will get you a Zotero reference which, when exported back out, will render the same in the MWE. I'm still trying to adjust the MWE so the SLACcitation renders. After that, it should just work, no Python script required.

retorquere commented 5 years ago

SLACcitation doesn't seem to do anything and looks to be set up explicitly so it doesn't: https://tex.stackexchange.com/a/467958/27603

blip-bloop commented 5 years ago

:robot: this is your friendly neighborhood build bot announcing test build 5.1.33.3285 ("SLACcitation gone again")

Install in Zotero by downloading test build 5.1.33.3285, opening the Zotero "Tools" menu, selecting "Add-ons", open the gear menu in the top right, and select "Install Add-on From File...".

retorquere commented 5 years ago

Alright; 3285 is the final build for this, I think. Pin from inspireHEP is still in, and as far as I can tell, the BibTeX that BBT outputs will render exactly as the InspireHEP-generated BibTeX would.

The regular Zotero InspireHEP importer doesn't import the arXiv data from pages like http://inspirehep.net/record/1421154/export/hx (which you may want to report here), but if you copy that BibTeX and select File / Import from clipboard, the import will be done by BBT and that will import them.

If you then export using BBT, you will get BibTeX that's semantically equivalent to what you imported; even if the generated BibTeX looks different, it will render to the same in your bibliography -- no python scripts required.

retorquere commented 5 years ago

I've added the 3285-generated BibTeX to the MWE you shared and you'll see that they render to exactly the same output.

BenjaminDbb commented 5 years ago

I tested build 5.1.33.3285 and found that it cannot fetch the citation key for an old-style arXiv identifier like arXiv: hep-th/0510040, whereas python pyinspire.py -s physics/0401042 -b works.

I think I should report the "fail to import InspireHEP BibTeX" problem to Zotero. Thank you.

blip-bloop commented 5 years ago

:robot: this is your friendly neighborhood build bot announcing test build 5.1.33.3291 ("regex.exec keeps state")

Install in Zotero by downloading test build 5.1.33.3291, opening the Zotero "Tools" menu, selecting "Add-ons", open the gear menu in the top right, and select "Install Add-on From File...".

retorquere commented 5 years ago

3291 should fix that.

It's not so much that zotero should import the bibtex on that page - they import another source on that page, it's just that it might be possible to fetch the arxiv data in addition to what they already fetch.

retorquere commented 5 years ago

If you can confirm 3291 does the job, I'll roll it into a new release.

BenjaminDbb commented 5 years ago

Yes, it works. thank you.

retorquere commented 5 years ago

Cool, will be part of the next release sometime today or tomorrow.

bulmust commented 5 years ago

Pin BibTex key from InspireHEP does not work for some references:

I tried to import from:

  1. https://inspirehep.net/record/1742302
  2. https://inspirehep.net/record/1742302/export/hx
  3. https://arxiv.org/abs/1907.00991

retorquere commented 5 years ago

How it's imported doesn't matter; that functionality works from items as they are in your library, so I'll need you to right-click those items in Zotero and send a BBT debug report.

retorquere commented 5 years ago

Really can't do anything without that debug log.

retorquere commented 5 years ago

No longer a problem?

bulmust commented 5 years ago

I started Zotero with Better BibTex Debug Report: Report ID: IV57E6AE-euc

retorquere commented 5 years ago

I need you to right-click the references of interest and send a debug log from the popup menu that appears; this debug log is from the help menu.

bulmust commented 5 years ago

I am sorry, I am new to this. I tried to pin the citation key from inspire (nothing happened). The following debug report was obtained from the Better BibTex Debug Report window:

Report ID: WRLZDDEU-euc