Closed vrajmohan closed 8 years ago
Questions:
This looks good!
Looks like test failures are integration tests for FEC and EPA, which probably both have runs of > 3 reserved sections. Would you mind verifying the data on FEC and I'll take a look at EPA?
- Is there ever a span of sections that are not RESERVED?
Good question -- we haven't seen any that weren't reserved across the four eRegs agencies, so I think that's a safe bet now.
- Should the span indicator be explicit rather than implicit in the title?
I don't follow; would you mind giving an example?
I'll verify the integration test failures for FEC.
Should the span indicator be explicit rather than implicit in the title?
In https://github.com/eregs/regulations-parser/blob/767aa18e7f5d8297c6ddc268ab079d8f37dcc09a/regparser/tree/xml_parser/reg_text.py#L256, we already know that we are dealing with sections and/or section spans. We don't indicate this information explicitly in the persisted output. It is implicit in the label
and the title
. In https://github.com/eregs/regulations-site/blob/23fd2ab001782a9654e26496aa96870d16b71614/regulations/generator/title_parsing.py#L40, we extract this information again using similar logic.
@vrajmohan gotcha. I think they're working slightly different angles. The build_from_section
logic is meant to only concern itself with the creation of Node
objects, so it cares about spans of sections only to the extent that it needs to create multiple Node
s or not. We're doing a bit of reformatting there, but I think that's more out of convenience than cleanliness.
I don't think that section-spanning data is very important to Node
s; but it is important to the table-of-contents layer. I'd argue that it should be explicit there. Unfortunately the relevant logic is actually within -site :(. We run into a similar problem in the N&C code -- it'd be ideal if the parser was doing the parsing rather than the UI.
I can confirm that the only changes to EPA are expected -- several runs of sections are merged into a single "reserved" section.
Likewise, I can confirm that the only changes to FEC are expected -- several runs of sections are merged into a single "reserved" section.
Looks good, test failures make sense, and @vrajmohan assures this is no longer a WIP
Identify these spans by the starting section number. Update the section title to indicate the sections in the span.