usgpo / bulk-data

User Guides for XML on the govinfo Bulk Data Repository. For information about Bill Status XML Bulk Data, see https://github.com/usgpo/bill-status.
https://www.govinfo.gov/bulkdata

CFR 2019 Title-3 chapters misplaced #57

Open gunnerwholelife opened 4 years ago

gunnerwholelife commented 4 years ago

I am looking to parse the CFRs and break them down into a tree-like structure using the bulk CFR data.

2019 Title 3 XML, vol 7: In all the other title XMLs, all the chapter/subchapter/section data is within the TITLE tag, but in the case of Title 3 it's misplaced. The TITLE tag contains only the following information:

<TITLE>
    <LRH>Title 3—The President</LRH>
    <RRH>Proclamations</RRH>
</TITLE>

All other titles (e.g. Title 1):

<TITLE>
    <LRH>1 CFR Ch. I (1-1-19 Edition)</LRH>
    <RRH>Admin. Comm. of the Federal Register</RRH>
    <CFRTITLE>
        <TITLEHD>
            <PRTPAGE P="1"/>
            <HD SOURCE="HED">Title 1—General Provisions</HD>
        </TITLEHD>
        <CFRTOC>...</CFRTOC>
    </CFRTITLE>
    <CHAPTER>...</CHAPTER>
    <CHAPTER>...</CHAPTER>
    <CHAPTER>...</CHAPTER>
    <CHAPTER>...</CHAPTER>
    <CHAPTER>...</CHAPTER>
    <CHAPTER>...</CHAPTER>
</TITLE>

Here are the relevant docs on govinfo/app for your reference.

How do I parse this correctly?

Thanks.

gunnerwholelife commented 4 years ago

Hey, any updates on this?
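
In the meantime, one possible way to cope with both layouts is to search the whole document for CHAPTER elements instead of expecting them under TITLE. The following is only a minimal sketch: the thread does not say where the Title 3 content actually sits, so the `DOC` wrapper, the sibling `CHAPTER` in the second sample, and the `summarize` helper are all hypothetical, and the sample strings are toy inputs rather than excerpts from the real files.

```python
import xml.etree.ElementTree as ET

# Toy input shaped like the Title 1 excerpt: chapters nested inside TITLE.
TITLE_1_LIKE = (
    "<TITLE>"
    "<LRH>1 CFR Ch. I (1-1-19 Edition)</LRH>"
    "<RRH>Admin. Comm. of the Federal Register</RRH>"
    "<CHAPTER/><CHAPTER/>"
    "</TITLE>"
)

# Hypothetical input: TITLE holds only running heads, and a CHAPTER sits
# elsewhere under an assumed DOC wrapper (not confirmed by the issue).
TITLE_3_LIKE = (
    "<DOC>"
    "<TITLE><LRH>Title 3</LRH><RRH>Proclamations</RRH></TITLE>"
    "<CHAPTER/>"
    "</DOC>"
)


def summarize(root):
    """Report where CHAPTER elements are found relative to TITLE."""
    # Element.iter() also yields the root itself, so this works whether
    # TITLE is the root or a descendant.
    title = next(root.iter("TITLE"), None)
    in_title = [] if title is None else list(title.iter("CHAPTER"))
    in_title_ids = {id(ch) for ch in in_title}
    # Chapters found anywhere in the document that are not under TITLE.
    elsewhere = [ch for ch in root.iter("CHAPTER") if id(ch) not in in_title_ids]
    return {
        "lrh": None if title is None else title.findtext("LRH"),
        "chapters_in_title": len(in_title),
        "chapters_elsewhere": len(elsewhere),
    }


if __name__ == "__main__":
    print(summarize(ET.fromstring(TITLE_1_LIKE)))
    print(summarize(ET.fromstring(TITLE_3_LIKE)))
```

A non-zero `chapters_elsewhere` count would flag a file whose chapters need to be re-attached to the title node when building the tree; whether that matches the real Title 3 file is something to check against the actual XML.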