retorquere / zotero-better-bibtex

Make Zotero effective for us LaTeX holdouts
https://retorque.re/zotero-better-bibtex/
MIT License

Enhancement for NASA ADS exports #1733

Closed HNLala closed 3 years ago

HNLala commented 3 years ago

Debug ID G3D4N9ZA-euc

I am trying to export citations taken from NASA ADS, which is commonly used by astronomers. For example, see https://ui.adsabs.harvard.edu/abs/2004A%26A...424..927C/exportcitation. I have included this citation format in the desired behavior below.

The BBT-generated citation key is quite handy and is one of the main reasons I am using BBT. However, I'd like to retain a few of the original fields as they are. In particular, instead of exporting url as 'url', exporting it as 'adsurl'. Additionally, I would like to keep the fields archivePrefix, eprint, primaryClass. Is there an option already present which will allow me to do that? Or will it be possible for you to implement such a feature?

Many astronomical journals link both the DOI (to the official published version of the article) and the url (linked to ADS). However, the latter is recognized only if it is named 'adsurl' and not 'url'.

Thanks a lot for this immensely helpful tool.

Desired behavior:

@ARTICLE{2004A&A...424..927C,
       author = {{Caputo}, F. and {Castellani}, V. and {Degl'Innocenti}, S. and {Fiorentino}, G. and {Marconi}, M.},
        title = "{Bright metal-poor variables: Why ``Anomalous'' Cepheids?}",
      journal = {\aap},
     keywords = {stars: variables: Cepheids, stars: evolution, Astrophysics},
         year = 2004,
        month = sep,
       volume = {424},
        pages = {927-934},
          doi = {10.1051/0004-6361:20040307},
archivePrefix = {arXiv},
       eprint = {astro-ph/0405395},
 primaryClass = {astro-ph},
       adsurl = {https://ui.adsabs.harvard.edu/abs/2004A&A...424..927C},
      adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

Actual behavior:

@article{caputo2004_BrightMetalpoorVariables,
  title = {Bright Metal-Poor Variables: {{Why}} ``{{Anomalous}}'' {{Cepheids}}?},
  shorttitle = {Bright Metal-Poor Variables},
  author = {Caputo, F. and Castellani, V. and Degl'Innocenti, S. and Fiorentino, G. and Marconi, M.},
  year = {2004},
  month = sep,
  volume = {424},
  pages = {927--934},
  issn = {0004-6361},
  doi = {10.1051/0004-6361:20040307},
  url = {http://adsabs.harvard.edu/abs/2004A%26A...424..927C},
  urldate = {2021-02-01},
  journal = {Astronomy and Astrophysics}
}

label-gun[bot] commented 3 years ago

It looks like you did not upload a debug report. The debug report is important; it gives @retorquere your current BBT settings and a copy of the problematic reference as a test case so he can best replicate your problem. Without it, @retorquere is effectively blind. Debug reports are useful for both bug analysis and enhancement requests; in the case of export enhancements, I need the copy of the references you have in mind.

If you did try to submit a debug report, but the ID looked like D<number>, that is a Zotero debug report, which I cannot access. Please re-submit a BBT debug log by one of the methods below.

This request is much more likely than not to apply to you, too, even if you think it unlikely, and even if it does not, there's no harm in sending a debug log that turns out to be unnecessary. @retorquere will more often than not just end up saying "please send a debug log first". Let's just skip over the unnecessary delay this entails. Sending a debug log is very easy:

  1. If your issue relates to how BBT behaves around a specific reference(s), such as citekey generation or export, select at least one of the problematic reference(s), right-click it, and submit a BBT debug report from that popup menu. If the problem is with export, please do include a sample of what you see exported, and what you expected to see exported for these references.

  2. If the issue does not relate to references and is of a more general nature, generate a debug report by restarting Zotero with debugging enabled (Help -> Debug Output Logging -> Restart with logging enabled), reproducing your problem, and selecting "Send Better BibTeX debug report..." from the help menu.

Once done, you will see a debug ID in red. Please post that debug id in the issue here.

Thank you!

retorquere commented 3 years ago

debug log. please.

github-actions[bot] commented 3 years ago

:robot: this is your friendly neighborhood build bot announcing test build 5.2.110.338 ("cleanup")

Install in Zotero by downloading test build 5.2.110.338, opening the Zotero "Tools" menu, selecting "Add-ons", opening the gear menu in the top right, and selecting "Install Add-on From File...".

retorquere commented 3 years ago

The adsurl can be handled with a postscript:

// On BibTeX/BibLaTeX export, rename the url field to adsurl when the URL points at NASA ADS
if (Translator.BetterTeX && reference.has.url && reference.has.url.value.includes('adsabs.harvard.edu')) {
  reference.has.url.name = 'adsurl'
}
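
To see what the condition does outside Zotero, here is a minimal sketch that runs the same logic against mock objects. The `renameAdsUrl` wrapper and the shapes of the mock `Translator` and `reference` objects are assumptions made for illustration; they only mirror the properties used in the postscript above and are not BBT's real objects.

```javascript
// Hypothetical wrapper around the postscript logic so it can run against plain objects.
function renameAdsUrl(Translator, reference) {
  const url = reference.has.url
  // Only touch items that actually have a url field pointing at NASA ADS
  if (Translator.BetterTeX && url && url.value.includes('adsabs.harvard.edu')) {
    url.name = 'adsurl'
  }
  return reference
}

// Mock items: one with an ADS URL, one with an unrelated URL (both invented for this sketch)
const adsItem = { has: { url: { name: 'url', value: 'http://adsabs.harvard.edu/abs/2004A%26A...424..927C' } } }
const otherItem = { has: { url: { name: 'url', value: 'https://example.org/paper' } } }

renameAdsUrl({ BetterTeX: true }, adsItem)
renameAdsUrl({ BetterTeX: true }, otherItem)

console.log(adsItem.has.url.name, otherItem.has.url.name)
```

Only the item whose URL contains `adsabs.harvard.edu` is renamed; items with other URLs, or with no URL at all, are left alone.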

WRT the arXiv stuff, G3D4N9ZA-euc doesn't have that info, so I can't "keep" what's not in your database. But if you import that sample above, BBT will put the following in the extra field:

arXiv: astro-ph/0405395 [astro-ph]
tex.adsnote: Provided by the SAO/NASA Astrophysics Data System
tex.adsurl: https://ui.adsabs.harvard.edu/abs/2004A&A...424..927C

and those will be exported back out to their respective biblatex fields on re-export. You don't have to re-import, of course; you can just add these lines to the extra field yourself.
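
As a rough picture of the `tex.` convention in those lines, the sketch below collects `tex.<name>: <value>` lines from an extra-field string into a plain object. The helper `texFieldsFromExtra` is hypothetical and is not how BBT parses the extra field internally; it ignores the `arXiv:` line, which BBT handles separately.

```javascript
// Hypothetical helper: pick out `tex.<name>: <value>` lines from an extra-field string.
function texFieldsFromExtra(extra) {
  const fields = {}
  for (const line of extra.split('\n')) {
    // Field name directly after "tex.", then a colon, then the rest of the line as the value
    const match = line.match(/^tex\.([A-Za-z][\w-]*)\s*:\s*(.*)$/)
    if (match) fields[match[1]] = match[2].trim()
  }
  return fields
}

const extra = [
  'arXiv: astro-ph/0405395 [astro-ph]',
  'tex.adsnote: Provided by the SAO/NASA Astrophysics Data System',
  'tex.adsurl: https://ui.adsabs.harvard.edu/abs/2004A&A...424..927C',
].join('\n')

const texFields = texFieldsFromExtra(extra)
console.log(texFields)
```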

If you want to import the \aap command in a way that it will be re-exported, you can test with build 338 above; you'll have to set the hidden preference importUnknownTexCommand to tex before import. I've only just added it, so it is hidden for now, but I may make it an exposed setting.

HNLala commented 3 years ago

Debug Log WYLFJJ9R-euc

Thank you for the detailed reply, I have followed what you recommended. However, I noticed a difference in how the exports work based on how I was importing the reference. Maybe I should report it to Zotero devs. Anyway, details are:

Case A: If I use the Zotero connector and save the reference using the Firefox addon on this page (https://ui.adsabs.harvard.edu/abs/2020AJ....160..181J/abstract), BBT exports the following citation:


@article{ji2020_SouthernStellarStream,
  title = {The {{Southern Stellar Stream Spectroscopic Survey}} ({{S5}}): {{Chemical Abundances}} of {{Seven Stellar Streams}}},
  shorttitle = {The {{Southern Stellar Stream Spectroscopic Survey}} ({{S5}})},
  author = {Ji, Alexander P. and Li, Ting S. and Hansen, Terese T. and Casey, Andrew R. and Koposov, Sergey E. and Pace, Andrew B. and Mackey, Dougal and Lewis, Geraint F. and Simpson, Jeffrey D. and {Bland-Hawthorn}, Joss and Cullinane, Lara R. and Da Costa, Gary. S. and Hattori, Kohei and Martell, Sarah L. and Kuehn, Kyler and Erkal, Denis and Shipp, Nora and Wan, Zhen and Zucker, Daniel B.},
  year = {2020},
  month = oct,
  volume = {160},
  pages = {181},
  issn = {0004-6256},
  doi = {10.3847/1538-3881/abacb6},
  url = {http://adsabs.harvard.edu/abs/2020AJ....160..181J},
  urldate = {2021-02-03},
  journal = {The Astronomical Journal}
}

Case B: On the other hand, if I import the citation from the clipboard in the following format:

@ARTICLE{2020AJ....160..181J,
       author = {{Ji}, Alexander P. and {Li}, Ting S. and {Hansen}, Terese T. and {Casey}, Andrew R. and {Koposov}, Sergey E. and {Pace}, Andrew B. and {Mackey}, Dougal and {Lewis}, Geraint F. and {Simpson}, Jeffrey D. and {Bland-Hawthorn}, Joss and {Cullinane}, Lara R. and {Da Costa}, Gary. S. and {Hattori}, Kohei and {Martell}, Sarah L. and {Kuehn}, Kyler and {Erkal}, Denis and {Shipp}, Nora and {Wan}, Zhen and {Zucker}, Daniel B.},
        title = "{The Southern Stellar Stream Spectroscopic Survey (S$^{5}$): Chemical Abundances of Seven Stellar Streams}",
      journal = {\aj},
     keywords = {Globular star clusters, Stellar abundances, Dwarf galaxies, Milky Way stellar halo, 656, 1577, 416, 1060, Astrophysics - Solar and Stellar Astrophysics, Astrophysics - Astrophysics of Galaxies},
         year = 2020,
        month = oct,
       volume = {160},
       number = {4},
          eid = {181},
        pages = {181},
          doi = {10.3847/1538-3881/abacb6},
archivePrefix = {arXiv},
       eprint = {2008.07568},
 primaryClass = {astro-ph.SR},
       adsurl = {https://ui.adsabs.harvard.edu/abs/2020AJ....160..181J},
      adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

This format is copied to the clipboard from: https://ui.adsabs.harvard.edu/abs/2020AJ....160..181J/exportcitation

In this case, BBT pins the incoming citekey automatically (which might be something you can help me with) but the rest of the keywords are transferred as expected. Even the unknown tex command of the journal abbrev. works well now. The BBT export is:

@article{2020AJ....160..181J,
  title = {The Southern Stellar Stream Spectroscopic Survey (S{$^5$}): {{Chemical}} Abundances of Seven Stellar Streams},
  author = {Ji, Alexander P. and Li, Ting S. and Hansen, Terese T. and Casey, Andrew R. and Koposov, Sergey E. and Pace, Andrew B. and Mackey, Dougal and Lewis, Geraint F. and Simpson, Jeffrey D. and {Bland-Hawthorn}, Joss and Cullinane, Lara R. and Da Costa, Gary. S. and Hattori, Kohei and Martell, Sarah L. and Kuehn, Kyler and Erkal, Denis and Shipp, Nora and Wan, Zhen and Zucker, Daniel B.},
  year = {2020},
  month = oct,
  volume = {160},
  pages = {181},
  doi = {10.3847/1538-3881/abacb6},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System},
  adsurl = {https://ui.adsabs.harvard.edu/abs/2020AJ....160..181J},
  archivePrefix = {arXiv},
  eid = {181},
  eprint = {2008.07568},
  eprinttype = {arxiv},
  journal = {\aj},
  number = {4},
  primaryClass = {astro-ph.SR}
}

I understand if this might be too idealistic, but what I hope can be possible is:

  1. I use the Zotero Connector on the page: https://ui.adsabs.harvard.edu/abs/2020AJ....160..181J/abstract and it creates a new entry for this reference and attaches the related PDF.
  2. It recovers all the fields listed here (in the 'Export Citation' section, that is): https://ui.adsabs.harvard.edu/abs/2020AJ....160..181J/exportcitation
  3. BBT generates its own citekey
  4. The exported citation looks like this:
@article{ji2020_SouthernStellarStream,
  title = {The Southern Stellar Stream Spectroscopic Survey (S{$^5$}): {{Chemical}} Abundances of Seven Stellar Streams},
  author = {Ji, Alexander P. and Li, Ting S. and Hansen, Terese T. and Casey, Andrew R. and Koposov, Sergey E. and Pace, Andrew B. and Mackey, Dougal and Lewis, Geraint F. and Simpson, Jeffrey D. and {Bland-Hawthorn}, Joss and Cullinane, Lara R. and Da Costa, Gary. S. and Hattori, Kohei and Martell, Sarah L. and Kuehn, Kyler and Erkal, Denis and Shipp, Nora and Wan, Zhen and Zucker, Daniel B.},
  year = {2020},
  month = oct,
  volume = {160},
  pages = {181},
  doi = {10.3847/1538-3881/abacb6},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System},
  adsurl = {https://ui.adsabs.harvard.edu/abs/2020AJ....160..181J},
  archivePrefix = {arXiv},
  eid = {181},
  eprint = {2008.07568},
  eprinttype = {arxiv},
  journal = {\aj},
  number = {4},
  primaryClass = {astro-ph.SR}
}

In the desired format, every field except the citekey is identical to the incoming citation. Instead of importing the fields in its usual way, it will be great if Zotero lifts these fields verbatim from the 'Export Citation' section.

retorquere commented 3 years ago

Even the unknown tex command of the journal abbrev. works well now.

It's also possible to have these expanded into their full names BTW; in the advanced BBT prefs, there's an @strings field where you can add something like

@preamble{
\newcommand{\aj}{Full Journal Name}
}

This mechanism is pretty primitive; only simple commands without parameters can be defined this way.
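
To illustrate what "simple commands without parameters" means, here is a small sketch of that kind of expansion. Both helpers (`parseSimpleCommands`, `expandCommands`) are hypothetical and are not BBT's implementation; they handle only the `\newcommand{\name}{replacement}` form shown above, with no arguments and no nested braces.

```javascript
// Hypothetical helper: read `\newcommand{\name}{replacement}` definitions (no parameters, no nested braces).
function parseSimpleCommands(preamble) {
  const commands = {}
  const pattern = /\\newcommand\{\\([A-Za-z]+)\}\{([^{}]*)\}/g
  for (const m of preamble.matchAll(pattern)) {
    commands[m[1]] = m[2]
  }
  return commands
}

// Replace each known `\name` with its definition; unknown commands are left as they are.
function expandCommands(text, commands) {
  return text.replace(/\\([A-Za-z]+)/g, (whole, name) =>
    Object.prototype.hasOwnProperty.call(commands, name) ? commands[name] : whole)
}

const preamble = String.raw`@preamble{
\newcommand{\aj}{Full Journal Name}
}`

const commands = parseSimpleCommands(preamble)
console.log(expandCommands(String.raw`\aj`, commands))
console.log(expandCommands(String.raw`\aap`, commands))
```

A command that takes arguments, such as `\newcommand{\foo}[1]{...}`, does not match the pattern and is simply skipped, which mirrors the limitation described above.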

Zotero doesn't notify other parts of the code that a scrape is ongoing; the new item just appears in the database. For this to work, I'd have to patch the scraping process, detect that this specific site is being scraped, download the bibtex, etc... essentially duplicating Zotero's scrape facility, with site-specific scraping code on top of this to make it work. That is not something I want to take on.

  • BBT generates its own citekey

You can disable citekey import: https://retorque.re/zotero-better-bibtex/installation/preferences/hidden-preferences/#importcitationkey

In the desired format, every field except the citekey is identical to the incoming citation. Instead of importing the fields in its usual way, it will be great if Zotero lifts these fields verbatim from the 'Export Citation' section.

I know that Zotero has a priority order for their scrapers when multiple formats are available -- if this is also the case for the ADS site, you may be able to give bibtex import precedence either automatically or manually. I haven't used this in many years, but I can't imagine this would have been removed. You'd have to ask about this on the Zotero forums, they'll know better.

retorquere commented 3 years ago

Does this address the problem you have?

retorquere commented 3 years ago

I'm going to assume your problem has been addressed.

HNLala commented 3 years ago

Sorry for the delay. Your solutions worked to a T. Thanks.

retorquere commented 3 years ago

Super, thanks for the confirmation. I'll close the issue when I've merged it into a release; I keep these open as a reminder to do so.