sixlettervariables closed this issue 7 years ago
Hey @sixlettervariables, we're so glad to see you trying this out! I've got some good news and some bad news. The bad news is that this regulation has several concepts and fiddly bits we haven't encountered before; I'll create issues for the ones I've found. The good news is that there are some minor tweaks that can be made to the XML file input that will massage it into a format we do understand; even better, I have those tweaks as a patch for you ;)
First, run:
$ eregs clear
$ eregs pipeline 10 50 outdir --only-latest
This will
The crashing is okay in this case; it'll have downloaded and preprocessed a version of the regulation as XML. Apply this patch to it and you'll get something the parser can process. For the most part, that patch just moves around text, giving better hints to the parser around subparagraphs and the like; however, I did delete a few tables to kick the parser a little harder.
Once that patch has been applied to .eregs_index/annual/10/50/2015, re-run the pipeline command. With any luck, it'll pick up the modifications and spit out a working set of JSON files.
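Put together, the steps above might look roughly like the sketch below. The eregs commands and the XML path are the ones given above; the patch file name (10cfr50.patch), the workaround function, and the DRY_RUN switch are assumptions made for illustration and are not part of eregs.

```shell
#!/bin/sh
# Sketch of the workaround described above, not an official eregs script.
# PATCH_FILE is a hypothetical local name for the patch linked above.
PATCH_FILE="${PATCH_FILE:-10cfr50.patch}"
XML_FILE=".eregs_index/annual/10/50/2015"   # path given above

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

workaround() {
    run eregs clear
    # The first pipeline run is expected to crash after it has downloaded
    # and preprocessed the XML, so its failure is deliberately ignored.
    run eregs pipeline 10 50 outdir --only-latest || true
    # Apply the patch directly to the preprocessed XML file.
    run patch "$XML_FILE" "$PATCH_FILE" || return 1
    # The second run should pick up the patched XML.
    run eregs pipeline 10 50 outdir --only-latest
}
```

Calling workaround with DRY_RUN=1 set only prints the four commands, which is a cheap way to check the paths before touching anything; calling it without DRY_RUN runs them for real.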
Let us know how this goes and if we can help further!
Outstanding, thank you! I'll try this when I get home tonight.
The bad news is that this regulation has several concepts and fiddley bits we haven't encountered before...
Welcome to the Nuclear industry!
Hey @sixlettervariables, did you make any headway here?
I'm working through access to 10 CFR 50 under the eRegs format and ran into failures during parsing (not unheard of, I'm sure). However, I'm stuck as to where the problem may lie based on the errors I've received.
I began with pipeline:
I checked to see if maybe 95-17723 did not exist, but I found it on their website, so I'm not sure if this is a parsing failure or a failure in a given file to properly reference 95-17723.
As suggested, I then ran notice_order to see if I could better target the source of the error:
Now this points to possibly a different culprit, so I'm not exactly sure how to proceed.