fecgov / fec-eregs

The Federal Election Commission's web-based application that makes regulations easier to find, read and understand.
https://www.fec.gov/regulations/

Parsing eregs #375

Closed patphongs closed 6 years ago

patphongs commented 6 years ago

@cmc333333 Thanks for helping us get set up locally. After some testing, @pkfec and I tried to pipe in Title 11, Part 100 as a test and got a number of errors like this:

regparser.index.dependency.Missing: Missing dependency. .eregs_index/notice_xml/2013-18542 is needed for .eregs_index/version/11/100/2013-18542.

You will most likely get the same errors if you attempt to parse it as well. This is the command we ran:

eregs pipeline 11 100 http://localhost:8000/api

The parser seems to work for some parts and fail for others, which is a concern because we're not sure what would happen if we attempted to parse the entire regulation. Any suggestions or clues as to how we can overcome these errors?

cmc333333 commented 6 years ago

Hi @patphongs, if I recall correctly, FEC is only parsing the latest annual edition of each reg. Try:

eregs clear # remove all of the cached content
eregs pipeline --only-latest 11 100 http://localhost:8000/api
eregs pipeline --only-latest 11 101 http://localhost:8000/api
...

That said, it looks like @anthonygarvan wrote a script to load all of these last year in #349. Maybe try mining that?

patphongs commented 6 years ago

Thanks @cmc333333, the --only-latest flag seemed to do the trick. We'll take a closer look at Tony's PR https://github.com/18F/fec-eregs/pull/349.

@vrajmohan It appears you were the last to review that PR. Do you happen to remember why it was never merged? Were there issues with it?

vrajmohan commented 6 years ago

There were a few small issues in the PR that were not addressed. When I last reviewed eregs for the FEC team, I discovered this and created https://github.com/18F/fec-eregs/issues/369

pkfec commented 6 years ago

@cmc333333

I am more inclined to parse one **Title** at a time. This way I know which one is erroring out. In the meantime, I will also look at Tony's script in #349.
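
Running one unit at a time and noting which ones fail could be sketched in Python roughly as follows. This is only an illustration: it reuses the `eregs pipeline --only-latest` command and the local API URL quoted earlier in this thread, and it assumes that a failed run exits with a non-zero status. The function names `pipeline_command` and `parse_one_at_a_time` are hypothetical helpers, not part of eregs.

```python
import subprocess

# Local API endpoint used in the commands quoted in this thread.
API_URL = "http://localhost:8000/api"


def pipeline_command(part, title=11, api_url=API_URL, only_latest=True):
    """Build the argument list for a single `eregs pipeline` call."""
    cmd = ["eregs", "pipeline"]
    if only_latest:
        cmd.append("--only-latest")
    cmd += [str(title), str(part), api_url]
    return cmd


def parse_one_at_a_time(parts, run=subprocess.run):
    """Run the pipeline for each part in turn; return the parts that failed.

    Assumption: a failed parse exits with a non-zero return code.
    `run` is injectable so the loop can be tried without eregs installed.
    """
    failed = []
    for part in parts:
        result = run(pipeline_command(part))
        if result.returncode != 0:
            failed.append(part)
    return failed
```

Calling `parse_one_at_a_time([100, 101])` would run the two parts sequentially and return whichever of them exited with an error, so a single bad part does not hide the results for the others.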

I tried to parse Title 3 with and without the --only-latest flag and got a different error this time. I cleared the eregs cache as well, but that doesn't seem to fix the error. Could you please suggest what could have gone wrong with Title 3?

Commands I ran:

eregs pipeline --only-latest 11 3 http://localhost:8000/api
eregs pipeline 11 3 http://localhost:8000/api

[Screenshot, 2018-02-01 at 2:16:56 PM]

pkfec commented 6 years ago

@cmc333333 Sorry, my bad. There is no Part 3 for FEC. Please ignore my comments.

patphongs commented 6 years ago

Oh right, so it would appear that there are 45 regulations under Title 11:

1 Privacy Act
2 Sunshine Regulations; Meetings
4 Public Records And The Freedom Of Information Act
5 Access To Public Disclosure And Media Relations Division Documents
6 Enforcement Of Nondiscrimination On The Basis Of Handicap In Programs Or Activities Conducted By The Federal Election Commission
7 Standards Of Conduct
8 Collection Of Administrative Debts
100 Scope And Definitions
101 Candidate Status And Designations
102 Registration, Organization, And Recordkeeping By Political Committees
103 Campaign Depositories
104 Reports By Political Committees And Other Persons
105 Document Filing
106 Allocations Of Candidate And Committee Activities
107 Presidential Nominating Convention, Registration And Reports
108 Filing Copies Of Reports And Statements With State Officers
109 Coordinated And Independent Expenditures
110 Contribution And Expenditure Limitations And Prohibitions
111 Compliance Procedure
112 Advisory Opinions
113 Permitted And Prohibited Uses Of Campaign Accounts
114 Corporate And Labor Organization Activity
115 Federal Contractors
116 Debts Owed By Candidates And Political Committees
200 Petitions For Rulemaking
201 Ex Parte Communications
300 Non-Federal Funds
9001 Scope
9002 Definitions
9003 Eligibility For Payments
9004 Entitlement Of Eligible Candidates To Payments; Use Of Payments
9005 Certification By Commission
9006 Reports And Recordkeeping
9007 Examinations And Audits; Repayments
9008 Federal Financing Of Presidential Nominating Conventions
9012 Unauthorized Expenditures And Contributions
9031 Scope
9032 Definitions
9033 Eligibility For Payments
9034 Entitlements
9035 Expenditure Limitations
9036 Review Of Matching Fund Submissions And Certification Of Payments By Commission
9037 Payments And Reporting
9038 Examinations And Audits
9039 Review And Investigation Authority
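
With the part numbers known, the per-part commands can be generated rather than typed by hand. The sketch below is a dry run in Python: it only prints the `eregs pipeline --only-latest` command for each part listed above and does not invoke eregs. The names `TITLE_11_PARTS` and `pipeline_commands` are hypothetical, and the API URL is the local one used earlier in this thread.

```python
# Part numbers copied from the list above (note there is no Part 3).
TITLE_11_PARTS = (
    [1, 2, 4, 5, 6, 7, 8]
    + list(range(100, 117))    # 100 through 116
    + [200, 201, 300]
    + list(range(9001, 9009))  # 9001 through 9008
    + [9012]
    + list(range(9031, 9040))  # 9031 through 9039
)


def pipeline_commands(parts=TITLE_11_PARTS, api_url="http://localhost:8000/api"):
    """Return one `eregs pipeline --only-latest` command line per part."""
    return [f"eregs pipeline --only-latest 11 {part} {api_url}" for part in parts]


# Dry run: print the commands so they can be reviewed before running them.
for line in pipeline_commands():
    print(line)
```

Printing the commands first makes it easy to check the part list against the CFR before running anything, and the output can be pasted into a shell once it looks right.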

AmyKort commented 6 years ago

@pkfec -- is this relevant to the work you are doing now?

pkfec commented 6 years ago

@AmyKort: yeah, that's correct!