adsabs / ADSIngestParser

Curation parser library
MIT License

Add additional sources to input file "format" list #78

Open seasidesparrow opened 9 months ago

seasidesparrow commented 9 months ago

Is your feature request related to a problem? Please describe.

The base parser currently has a list of four formats that it can write to the document.loadFormat field: JATS, OtherXML, HTML, and Text. As part of reference output, we want to parse some content with a "source" set to the input file type. For reference parsing, it would be helpful to have those types reflected in the loadFormat field, especially the ones that would otherwise be subsumed under the "OtherXML" value. These could include "NLM" (non-JATS), "Crossref", "Zenodo", and "DublinCore".

Describe the solution you'd like

Consider adding the additional input types to the format definition in base.py, and updating the parsers to populate their own input format fields.
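
A rough sketch of what this could look like, to make the request concrete. All names here (KNOWN_LOAD_FORMATS, BaseParserSketch, CrossrefParserSketch, load_format, format_record) are hypothetical and not taken from the current base.py; only the format values and the document.loadFormat field come from the description above.

```python
# Hypothetical sketch only: these names are illustrative assumptions,
# not the actual ADSIngestParser API.

# The four existing values plus the proposed additions from this issue.
KNOWN_LOAD_FORMATS = (
    "JATS",
    "OtherXML",
    "HTML",
    "Text",
    "NLM",
    "Crossref",
    "Zenodo",
    "DublinCore",
)


class BaseParserSketch:
    # Each parser subclass declares the input format it handles.
    load_format = None

    def format_record(self, parsed):
        """Return a copy of the parsed record with document.loadFormat set."""
        if self.load_format not in KNOWN_LOAD_FORMATS:
            raise ValueError(f"Unsupported loadFormat: {self.load_format!r}")
        record = dict(parsed)
        record["document"] = {"loadFormat": self.load_format}
        return record


class CrossrefParserSketch(BaseParserSketch):
    # Previously this kind of input would have been reported as "OtherXML".
    load_format = "Crossref"


if __name__ == "__main__":
    print(CrossrefParserSketch().format_record({"title": "Example"}))
```

With something along these lines, each parser would only need to set its own format value, and the base class would reject values missing from the shared list.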
