usgpo / bulk-data

User Guides for XML on the govinfo Bulk Data Repository. For information about Bill Status XML Bulk Data, see https://github.com/usgpo/bill-status.
https://www.govinfo.gov/bulkdata

Misplaced Appendix in CFRs #55

Open gunnerwholelife opened 4 years ago

gunnerwholelife commented 4 years ago

I am looking to parse the CFRs and break them down into a tree-like structure using the bulk CFR data.

I came across various instances where the provided XML structure doesn't match the one presented on govinfo/app.

For example, in the 2019 Title 2 XML (see the attached screenshot):

[Image: title-2-xml-appendix-issue]

In the above XML, all the Appendix nodes fall under Subpart F - Audit Requirements, but on the govinfo/app source they fall under Part 200 - UNIFORM ADMINISTRATIVE REQUIREMENTS, COST PRINCIPLES, AND AUDIT REQUIREMENTS FOR FEDERAL AWARDS.

Here's an example Appendix on govinfo/app for your reference.

I have noticed this issue throughout all the titles.

Is this a bug? How do I parse this correctly?

Thanks.

jonquandt commented 4 years ago

@gunnerwholelife - thanks for bringing this to our attention. We're looking at this now.

If you have a list of examples, that would help us to identify and resolve the issue.

gunnerwholelife commented 4 years ago

Here are a few examples:

All appendices within Title 2, Subtitle A, Chapter II, Part 200 are misplaced under Title 2, Subtitle A, Chapter II, Part 200, Subpart F - Audit Requirements.

Appendix A to Part 170 (Title 2, Subtitle A, Chapter I, Part 170) is misplaced under Title 2, Subtitle A, Chapter I, Subpart C.

All appendices within Title 12, Chapter II, Subchapter A, Part 208 are misplaced under Title 12, Chapter II, Subchapter A, Part 208, Subpart J.

There are possibly more examples, and I will keep adding them here as I come across them. I hope the structure of the examples above works for you. Let me know if you want them in a different format.

I am also certain this misplaced data goes back to previous years of the CFRs. Would it be possible to patch the historical data too?

jonquandt commented 4 years ago

@gunnerwholelife -- I think those references make sense to me. We'll take this back. I'm not sure how far back we will be able to go. Will learn more as we dig into this further.

gunnerwholelife commented 4 years ago

Hey, any updates on this?
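
Until the bulk XML itself is corrected, one possible client-side workaround is to re-parent the misplaced appendices while building the tree. The sketch below is not an official fix. It assumes the element names `PART`, `SUBPART`, `APPENDIX`, and `HD` (for headings), and it assumes that a part-level appendix can be recognized by the phrase "to Part" in its heading; both assumptions should be checked against the actual files. The function name `reparent_appendices` and the sample fragment are hypothetical and only illustrate the nesting described in this issue.

```python
import xml.etree.ElementTree as ET

# Assumed element names; verify against the real CFR bulk XML.
PART_TAG, SUBPART_TAG, APPENDIX_TAG, HEADING_TAG = "PART", "SUBPART", "APPENDIX", "HD"


def heading_text(element):
    """Return the text of the element's first heading child, or ''."""
    heading = element.find(HEADING_TAG)
    return "".join(heading.itertext()).strip() if heading is not None else ""


def reparent_appendices(root):
    """Move part-level appendices out of subparts and up to their part.

    Only appendices whose heading contains "to part" (case-insensitive)
    are moved, so an appendix that belongs to a subpart stays in place.
    Returns the headings of the appendices that were moved.
    """
    moved = []
    # Snapshot each iteration with list() because the tree is modified below.
    for part in list(root.iter(PART_TAG)):
        for subpart in list(part.iter(SUBPART_TAG)):
            for appendix in list(subpart.findall(APPENDIX_TAG)):
                title = heading_text(appendix)
                if "to part" in title.lower():
                    subpart.remove(appendix)
                    part.append(appendix)
                    moved.append(title)
    return moved


# Made-up fragment that mimics the nesting reported above; not real CFR data.
SAMPLE = """
<PART>
  <HD>PART 200</HD>
  <SUBPART>
    <HD>Subpart F - Audit Requirements</HD>
    <APPENDIX><HD>Appendix I to Part 200</HD></APPENDIX>
    <APPENDIX><HD>Appendix II to Part 200</HD></APPENDIX>
  </SUBPART>
</PART>
"""

if __name__ == "__main__":
    sample_root = ET.fromstring(SAMPLE)
    for title in reparent_appendices(sample_root):
        print("moved:", title)
```

The heading check is deliberately conservative: anything that does not clearly name a part is left where the source XML put it, so the workaround cannot silently detach an appendix that really belongs to a subpart.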