fecgov / openFEC-web-app

DEPRECATED See https://github.com/18F/fec-cms for fec.gov's code

AO pages not showing any/all regulatory and statutory citations from final opinion #2105

Closed AmyKort closed 6 years ago

AmyKort commented 7 years ago

https://www.fec.gov/data/legal/advisory-opinions/2016-26/
https://www.fec.gov/data/legal/advisory-opinions/2001-13/
https://www.fec.gov/data/legal/advisory-opinions/2002-01/
https://www.fec.gov/data/legal/advisory-opinions/2000-06/

We seem to be missing a lot of statutory and regulatory citations in the Legal Citations box. Is this unexpected?

noahmanger commented 7 years ago

@AmyKort are there any pages that are showing citations? Or does this seem like an intermittent issue? cc @vrajmohan and @anthonygarvan

vrajmohan commented 7 years ago

I will investigate. If you remember, there were no citations for AOs published since 2014 because the OCR text was empty. This was fixed last week. Perhaps there are some stragglers.

AmyKort commented 7 years ago

It seems like an intermittent issue. I noted that a lot of canonical pages show regulations and no statutes, which would be highly unusual in an AO. Just browsing around, it looks like we pick up more regulatory citations than statutory citations, but we're also not picking up all of the regulatory citations. I didn't notice that 2014 to present AOs were worse off, but I can look again with that in mind.

vrajmohan commented 7 years ago

In all four cases that you listed, poor OCR quality is the reason the statutory citations were not identified. For example, "26 U.S.C. §9008" comes out of the OCR with an extra character immediately before the section symbol "§", and that extra character is throwing off the parser. We could make the parser more sophisticated, but that may yield some false positives.
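
To illustrate the trade-off, here is a minimal sketch of a strict versus a more tolerant citation pattern. It is not the project's actual parser, which is not shown in this thread. The names `STRICT`, `TOLERANT`, and `find_statutes` are hypothetical, and the stray character "Â" is only an assumed stand-in for whatever the OCR actually emitted.

```python
import re

# Strict pattern: title number, "U.S.C.", then "§" directly before the section number.
STRICT = re.compile(r"(\d+)\s+U\.S\.C\.\s*§\s*(\d+[a-z]?)")

# Tolerant pattern: allow at most one stray character before "§".
# [^\s§\d] excludes whitespace, the section symbol itself, and digits,
# which keeps the relaxation narrow but can still admit false positives.
TOLERANT = re.compile(r"(\d+)\s+U\.S\.C\.\s*[^\s§\d]?\s*§\s*(\d+[a-z]?)")


def find_statutes(text, pattern):
    """Return (title, section) pairs for every citation the pattern matches."""
    return [(m.group(1), m.group(2)) for m in pattern.finditer(text)]


clean = "See 26 U.S.C. §9008 for details."
noisy = "See 26 U.S.C. Â§9008 for details."  # "Â" is an assumed OCR artifact

print(find_statutes(clean, STRICT))    # citation found
print(find_statutes(noisy, STRICT))    # citation missed because of the stray character
print(find_statutes(noisy, TOLERANT))  # citation recovered
```

Allowing only a single optional character keeps the relaxed pattern close to the strict one. A looser rule, such as skipping any run of characters before "§", would recover more citations but would also be more likely to match text that is not a citation.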

AmyKort commented 7 years ago

Thank you for tracking that down!

noahmanger commented 7 years ago

Is there any further action needed on this?

AmyKort commented 6 years ago

resolved by https://github.com/18F/fec-cms/issues/1425