Open acheong08 opened 1 year ago
Congress is a mess and uses PDFs, which are difficult to parse automatically. The wording is also not very specific, so extracting information will require some basic NLP.
C: financial-pdfs
P: ptr-pdfs
Only periodic transaction reports (PTRs) are relevant.
There are some annoying dependency issues. Instructions for fixing them are in the tooling folder.
https://disclosures-clerk.house.gov/PublicDisclosure/FinancialDisclosure
and
https://disclosures-clerk.house.gov/public_disc/ptr-pdfs/<year>/<ID>.pdf
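
As a rough sketch of how the second URL pattern could be filled in, the helper below builds a PTR PDF link from a year and a document ID. The function name `ptr_pdf_url`, the constant `PTR_BASE`, and the ID used in the example are placeholders invented for illustration, not anything from the repo, and the code only constructs the string without downloading anything.

```python
from urllib.parse import quote

# Base path taken from the URL pattern above.
PTR_BASE = "https://disclosures-clerk.house.gov/public_disc/ptr-pdfs"


def ptr_pdf_url(year: int, doc_id: str) -> str:
    """Fill in the <year>/<ID>.pdf template (hypothetical helper)."""
    if not isinstance(year, int) or not 1000 <= year <= 9999:
        raise ValueError(f"expected a four-digit year, got {year!r}")
    doc_id = str(doc_id).strip()
    if not doc_id:
        raise ValueError("document ID must not be empty")
    # Percent-encode the ID so unexpected characters cannot alter the path.
    return f"{PTR_BASE}/{year}/{quote(doc_id, safe='')}.pdf"


if __name__ == "__main__":
    # "12345678" is a made-up ID used only to show the resulting shape.
    print(ptr_pdf_url(2023, "12345678"))
```

Where the IDs come from is left open here; presumably they would be gathered from the search page at the first URL.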