usnationalarchives / federalregister-api-core

Federal Register 2.0 API and Data Importer
https://www.federalregister.gov

Incorrect Text In XML Tag #20

Closed Aniformer closed 4 years ago

Aniformer commented 4 years ago

Hi team,

We noticed a discrepancy on 6 December 2019 while working with the document with notification ID 2019-25944. We made this API call: https://www.federalregister.gov/api/v1/documents/2019-25944.json and then parsed the XML for the FR document at https://www.federalregister.gov/documents/full_text/xml/2019/12/06/2019-25944.xml (this link comes from the "full_text_xml_url" key in the JSON output of the API call).
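For anyone reproducing the steps above, here is a minimal Python sketch. It assumes only the URL pattern and the "full_text_xml_url" key quoted in this comment; the function names are hypothetical, and error handling is left out.

```python
import json
import urllib.request

# Base URL taken from the API call quoted in this issue.
API_BASE = "https://www.federalregister.gov/api/v1/documents"


def document_api_url(document_number):
    """Build the JSON endpoint URL for one document number."""
    return f"{API_BASE}/{document_number}.json"


def full_text_xml_url(document):
    """Return the XML link from a decoded API response, or None if absent."""
    return document.get("full_text_xml_url")


def fetch_document(document_number, timeout=30):
    """Download and decode the JSON for one document (needs network access)."""
    with urllib.request.urlopen(document_api_url(document_number), timeout=timeout) as response:
        return json.load(response)
```

Calling `full_text_xml_url(fetch_document("2019-25944"))` should then give the XML link mentioned above, provided the response still carries that key.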

In the XML documents we have seen so far, the CFR reference (for example, '17 CFR Part 275') appears in the "CFR" tag. In the XML linked above, however, the CFR reference '12 CFR Part 1005' appears in the "SUBAGY" tag instead.
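To illustrate the difference, here is a rough Python sketch of a lookup that reads the "CFR" tag first and falls back to "SUBAGY" only when no reference is found there. The function name, the regular expression, and the simplified XML fragments in the usage note are our own assumptions, not the real document markup.

```python
import re
import xml.etree.ElementTree as ET

# Assumed shape of a reference, e.g. "17 CFR Part 275"; real documents may
# use other forms (ranges, multiple parts) that this pattern does not cover.
CFR_PATTERN = re.compile(r"\b\d+\s+CFR\s+Parts?\s+\d+\b")


def find_cfr_references(xml_text):
    """Return (tag, reference) pairs, preferring <CFR> over <SUBAGY>."""
    root = ET.fromstring(xml_text)
    for tag in ("CFR", "SUBAGY"):
        found = []
        for element in root.iter(tag):
            # Join nested text and collapse whitespace before matching.
            text = " ".join("".join(element.itertext()).split())
            found.extend((tag, match) for match in CFR_PATTERN.findall(text))
        if found:
            return found
    return []
```

With a made-up fragment such as `<RULE><PREAMB><SUBAGY>12 CFR Part 1005</SUBAGY></PREAMB></RULE>`, the function reports the reference together with the tag it came from, which makes misplaced references easy to spot.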

Any insight into why this change occurred would be helpful!

Thank you, Anirudh

peregrinator commented 4 years ago

Hi Anirudh - we don't create the XML markup so I can't speak to why it may have gotten encoded that way. We are consumers of the XML as well - the Government Publishing Office generates the XML at the same time they are creating the print issue. They may be able to shed light on what happened here.

We also make use of the metadata MODS XML files - in fact that is where we extract the CFR references that are present in our API. So for the document you referenced, the MODS file is available here: https://www.govinfo.gov/metadata/granule/FR-2019-12-06/2019-25944/mods.xml, and you can see the CFR reference in the <cfr> tag.
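As a rough illustration of reading the reference from a MODS file, here is a Python sketch. It is not the importer's actual code: the function name is hypothetical, and it simply collects the text of any element whose local name is `cfr`, ignoring namespaces, since the exact layout of the real file is not shown in this thread.

```python
import xml.etree.ElementTree as ET


def cfr_entries_from_mods(mods_xml):
    """Collect the text of every <cfr> element, whatever its namespace."""
    root = ET.fromstring(mods_xml)
    entries = []
    for element in root.iter():
        # ElementTree writes namespaced tags as "{uri}name"; keep the name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "cfr":
            text = " ".join("".join(element.itertext()).split())
            if text:
                entries.append(text)
    return entries
```

If the real `<cfr>` element stores its data in attributes or nested children rather than plain text, the function would need to be adjusted accordingly.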

Hope that helps!