rbshaffer / gpo_tools

Scraping and parsing tools for the GPO's congressional hearings dataset.
MIT License

Invalid Date Format Error #3

Open maxkeber opened 3 years ago


Hi Robert,

I'm trying to run a scrape to update my dataset to include more recent years. It's working fine for the 115th and 116th Congresses, but I'm getting an error when I re-scrape the 114th. See error reproduced here:

self.cur.execute(cmd, data)
psycopg2.errors.InvalidDatetimeFormat: invalid input syntax for type date: ""
LINE 33622: ','114','2','SENATE','',ARRAY['Appropriations'],'{}','https:...
^

It looks like at least one hearing in the 114th Congress has a missing date (an empty string, judging by the error), and PostgreSQL will not accept an empty string as a value of type date.

Do you have any suggestions for getting around this, either by fixing the specific problem file or by allowing the scraper to accept missing dates and keep running?
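For what it's worth, here is a minimal sketch of the second option as I imagine it: turn blank dates into None before the row is handed to cur.execute, since psycopg2 sends None as SQL NULL when it is passed as a query parameter. The names normalize_date and clean_row and the 'date' key are hypothetical (I have not checked how gpo_tools builds its rows), and this assumes the date column in the table allows NULL.

```python
from typing import Any, Dict, Optional, Sequence


def normalize_date(value: Optional[str]) -> Optional[str]:
    """Return None for a missing or blank date, otherwise the stripped string."""
    if value is None:
        return None
    value = value.strip()
    # An empty string is falsy, so blank input becomes None.
    return value or None


def clean_row(row: Dict[str, Any], date_keys: Sequence[str] = ("date",)) -> Dict[str, Any]:
    """Return a copy of row with blank date fields replaced by None.

    The key name 'date' is an assumption for illustration, not taken
    from gpo_tools. The input row is left unmodified.
    """
    cleaned = dict(row)
    for key in date_keys:
        if key in cleaned:
            cleaned[key] = normalize_date(cleaned[key])
    return cleaned


if __name__ == "__main__":
    # Hypothetical row resembling the values in the failing statement.
    row = {"congress": "114", "session": "2", "chamber": "SENATE", "date": ""}
    print(clean_row(row))
```

If the column is declared NOT NULL, this would only trade one error for another, so skipping or logging the affected hearing might be the safer route in that case.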

Thanks so much! Max