usnistgov / NIST-Tech-Pubs

XML metadata for NIST Technical Series Publications
https://pages.nist.gov/NIST-Tech-Pubs/

Python bibtex #32

Closed simsong closed 1 year ago

simsong commented 1 year ago

This is my initial effort at converting the allrecords.xml file into a BibTeX file. Right now the conversion has to be run manually, but I will create a GitHub action later. Getting this out is a priority.

Please let me know what you think.
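As a rough illustration of the kind of conversion described here, the following is a minimal sketch and not the code from this PR. The element names (`record`, `id`, `title`, `year`, `doi`), the sample data, and the helper names `record_to_bibtex` and `xml_to_bibtex` are all hypothetical; the real allrecords.xml schema is not shown in this thread and will differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: these element names and values are
# illustrative only and are NOT taken from the real allrecords.xml.
SAMPLE_XML = """
<records>
  <record>
    <id>NIST.EXAMPLE.1</id>
    <title>An Example Report</title>
    <year>2022</year>
    <doi>10.0000/NIST.EXAMPLE.1</doi>
  </record>
</records>
"""


def record_to_bibtex(record):
    """Format one <record> element as a BibTeX @techreport entry."""
    key = record.findtext("id", default="unknown")
    fields = {
        "title": record.findtext("title"),
        "year": record.findtext("year"),
        "doi": record.findtext("doi"),
    }
    lines = [f"@techreport{{{key},"]
    for name, value in fields.items():
        if value:  # skip fields that are missing from the record
            # Values are emitted verbatim; real data would also need
            # LaTeX special characters escaped.
            lines.append(f"  {name} = {{{value.strip()}}},")
    lines.append("}")
    return "\n".join(lines)


def xml_to_bibtex(xml_text):
    """Convert every <record> in the XML text to a BibTeX entry."""
    root = ET.fromstring(xml_text.strip())
    return "\n\n".join(record_to_bibtex(r) for r in root.iter("record"))


if __name__ == "__main__":
    print(xml_to_bibtex(SAMPLE_XML))
```

Building each entry from a dictionary of fields keeps the output format in one place, so adding or dropping a field does not touch the formatting logic.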

ronaldtse commented 1 year ago

@simsong this will be extremely helpful for TeX users -- thank you!

ronaldtse commented 1 year ago

@simsong just in case you've not seen it, there is already a BibTeX generation XSLT provided in this repo: https://github.com/usnistgov/NIST-Tech-Pubs/blob/nist-pages/xslt/techpubs2bibtex-rev-github.xsl

Perhaps that stylesheet is not usable for some reason?

simsong commented 1 year ago

> @simsong just in case you've not seen it, there is already a BibTeX generation XSLT provided in this repo: https://github.com/usnistgov/NIST-Tech-Pubs/blob/nist-pages/xslt/techpubs2bibtex-rev-github.xsl
>
> Perhaps that stylesheet is not usable for some reason?

I did not see it. The BibTeX it generates is not correctly escaped.

@kmiller621 wrote to me in a private email:

> I am pretty sure there is a script that converts RIS to bibtex files, but I’m not sure how to get it to work with what I’ve got.
>
> I was trying to create an xslt to transform my XML to bibtex but couldn’t get it to work properly. But, now I’m thinking the RIS might be the files to work with. Any ideas?

For me, it was much easier to write this in Python than to try to figure out what is going on with xslt.
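To make the escaping concern concrete, here is a minimal sketch of escaping LaTeX special characters in a BibTeX field value. It is an illustration only, not the code from this PR, and the thread does not say which characters the XSLT output mishandles; the helper name `escape_bibtex` and the replacement table are assumptions.

```python
# Characters that LaTeX treats specially, mapped to escaped forms.
# This table is an illustrative assumption, not taken from the PR.
_LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_bibtex(text):
    """Escape LaTeX special characters in a BibTeX field value.

    Each input character is looked up exactly once, so the backslashes
    and braces introduced by a replacement are never escaped again.
    """
    return "".join(_LATEX_ESCAPES.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(escape_bibtex("Research & Development: 50% of cases"))
```

A character-by-character pass is used instead of chained `str.replace` calls because chained replacements would re-escape the backslashes produced by earlier steps.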

simsong commented 1 year ago

Oh, I should add that I now have a GitHub action that automatically runs the Python script on each push, so that you don't need to have the BibTeX file in the repo itself. This is probably the way that the RIS files should also be created. Check it out:

The BibTeX file is here: https://github.com/simsong/NIST-Tech-Pubs/suites/7390405553/artifacts/300857221

I think that I need to rename assets-for-download but I'm not quite sure how to do that yet.

ronaldtse commented 1 year ago

Aha, that makes sense. Thanks @simsong !

> Oh, I should add that I now have a GitHub action that automatically runs the python on each push, so that you don't need to have the bibtex file in the repo itself. This is probably the way that the RIS files should be created also.

Fully agree. It's best to separate the source data (the XML file) and the derived data (RIS, BibTeX, ...).

Rather than relying on GitHub Actions artifacts (by default they go away in X days), the best approach is to use the Releases feature, which works on git tags (versions). The official-unofficial action to use is this:

In fact, the IETF generates its NIST bibliographic repository in Relaton YAML format using this workflow:

This Relaton YAML format is used for citing NIST documents in Metanorma, which also has a private NIST version supported by the CSD.

simsong commented 1 year ago

Thanks. I'll look into the releases.