HEPData / hepdata

Repository for main HEPData web application
https://hepdata.net
GNU General Public License v2.0

records: implement content negotiation of records in BibTeX format #755

Open GraemeWatt opened 8 months ago

GraemeWatt commented 8 months ago

Question from @jackaraz:

Is there an API from which we can download BibTeX entries for a submitted analysis? When users use a certain analysis, I want to create a .bib file so that they can properly cite everything they used. InspireHEP has something similar; for instance, I can retrieve a BibTeX entry like this:

import requests
import textwrap

headers = {"accept": "application/x-bibtex"}
response = requests.get(
    "https://inspirehep.net/api/arxiv/2303.03427", headers=headers, timeout=5
)

response.encoding = "utf-8"
print(textwrap.indent(response.text, " " * 4))

I was wondering if HEPData has something similar.

The best implementation might be in terms of content negotiation as for the JSON-LD metadata format, i.e. BibTeX format could be returned for a record page with an HTTP header Accept: application/x-bibtex. This might require some refactoring of the HEPData code. Currently, the BibTeX format is written in cite-widget.html and is accessible only when a user clicks the "Cite" button in the top-right of a record page followed by selecting the "BibTeX" tab of the "Citing this record" widget.
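To make the idea concrete, here is a minimal sketch of how a server could choose a response format from the Accept header. It is not HEPData code: the names SUPPORTED, parse_accept and choose_media_type are hypothetical, the list of media types (in particular application/ld+json for JSON-LD) is an assumption, and the parsing is a simplified version of HTTP content negotiation that ignores wildcard subtypes such as text/*.

```python
# Hypothetical sketch of Accept-header content negotiation (not HEPData code).
# Assumption: a record page could be served in these three formats.
SUPPORTED = ("text/html", "application/ld+json", "application/x-bibtex")


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q value."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_media_type(header, supported=SUPPORTED, default="text/html"):
    """Pick the best supported media type, falling back to the default."""
    for media_type, q in parse_accept(header or ""):
        if q <= 0:
            continue
        if media_type == "*/*":
            return default
        if media_type in supported:
            return media_type
    return default


print(choose_media_type("application/x-bibtex"))
```

With something along these lines, the record view would render the BibTeX template only when choose_media_type returns application/x-bibtex, and keep serving HTML otherwise.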

In the meantime, it is fairly easy to scrape the HTML of a record page to get the BibTeX format. For example, using the Python requests and beautifulsoup4 packages:

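import requests
from bs4 import BeautifulSoup

# Ask explicitly for the HTML version of the record page.
r = requests.get('https://www.hepdata.net/record/ins1750597', headers={'Accept': 'text/html'}, timeout=10)
r.raise_for_status()
soup = BeautifulSoup(r.text, 'html.parser')
# The BibTeX entry is the text of the element with id 'record_bibtex'.
bibtex = soup.find(id='record_bibtex').string
print(bibtex)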

gives:

@misc{hepdata.89413,
    author = "{ATLAS Collaboration}",
    title = "{Search for electroweak production of charginos and sleptons decaying into final states with two leptons and missing transverse momentum in $\sqrt{s}=13$ TeV $pp$ collisions using the ATLAS detector}",
    howpublished = "{HEPData (collection)}",
    year = 2022,
    note = "\url{https://doi.org/10.17182/hepdata.89413}"
}
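
If adding beautifulsoup4 as a dependency is not desirable, the same extraction can be sketched with only the standard library. This is an alternative to the snippet above, not something HEPData provides: BibtexExtractor and extract_bibtex are hypothetical names, the only thing taken from the record page is the element id record_bibtex, and the depth tracking assumes that element contains no unclosed void tags such as a bare br.

```python
from html.parser import HTMLParser


class BibtexExtractor(HTMLParser):
    """Collect the text inside the element with a given id (hypothetical helper)."""

    def __init__(self, element_id="record_bibtex"):
        super().__init__(convert_charrefs=True)
        self.element_id = element_id
        self.depth = 0    # > 0 while we are inside the target element
        self.chunks = []  # text fragments found inside the target element

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1  # nested tag inside the target element
        elif dict(attrs).get("id") == self.element_id:
            self.depth = 1   # entering the target element

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_bibtex(html, element_id="record_bibtex"):
    """Return the BibTeX text from a record page, or None if it is absent."""
    parser = BibtexExtractor(element_id)
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks).strip()
    return text or None


# Small inline document standing in for a downloaded record page.
sample = '<p>intro</p><div id="record_bibtex">@misc{hepdata.89413,\n    year = 2022\n}</div>'
print(extract_bibtex(sample))
```

Returning None when the element is missing lets the caller tell "no BibTeX on this page" apart from a network failure, which the one-line .string access would report as an AttributeError.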

jackaraz commented 8 months ago

That's great! Thanks, @GraemeWatt! I'm assuming ins1750597 is the Inspire ID?

GraemeWatt commented 8 months ago

I'm assuming ins1750597 is the Inspire ID?

Yes, https://www.hepdata.net/record/ins1750597 corresponds to https://inspirehep.net/literature/1750597 .

https://www.hepdata.net/record/89413 or https://doi.org/10.17182/hepdata.89413 also work (but not the arXiv identifier).
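
For scripting, the three accepted identifier forms can be mapped to URLs with a small helper. This is only a sketch: record_url is a hypothetical function, and the patterns cover just the forms mentioned in this thread (an Inspire ID with the ins prefix, a numeric HEPData record ID, and a DOI of the form 10.17182/hepdata.N). Anything else, including an arXiv identifier, is rejected.

```python
import re


def record_url(identifier):
    """Map an identifier to a HEPData record URL (hypothetical helper)."""
    identifier = str(identifier).strip()
    # Inspire ID ("ins1750597") or numeric HEPData record ID ("89413")
    if re.fullmatch(r"(ins)?\d+", identifier):
        return f"https://www.hepdata.net/record/{identifier}"
    # HEPData DOI, assumed here to look like "10.17182/hepdata.89413"
    if re.fullmatch(r"10\.17182/hepdata\.\d+", identifier):
        return f"https://doi.org/{identifier}"
    raise ValueError(f"unsupported identifier: {identifier!r}")


for example in ("ins1750597", 89413, "10.17182/hepdata.89413"):
    print(record_url(example))
```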