Partially addresses issue #37: this integrates @bertamb's court summary parsing functions into download.py so that it creates two .csv files, one for the docket files and one for the court summaries.
This PR does not update daily_docket_pull.yml (or whichever workflow we end up using this in) to save the court summary .csv file to the bucket @adamrlinder has created.
Addresses issue #55: this parametrizes download.py so that it accepts either a single docket number or the AWS Athena ID-key pair.
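For reviewers, here is a rough sketch of what "either a single docket number or the Athena ID-key pair" could look like with argparse. Only the --docket flag comes from this PR; the --athena-id and --athena-key flag names and the build_parser / resolve_mode helpers are hypothetical and may not match what download.py actually does.

```python
import argparse


def build_parser():
    """Build a parser accepting a docket number or an Athena ID-key pair."""
    parser = argparse.ArgumentParser(description="Download and parse dockets.")
    # --docket is the flag used in this PR.
    parser.add_argument("--docket", help="a single docket number to process")
    # Hypothetical flag names for the Athena credentials (assumption).
    parser.add_argument("--athena-id", help="AWS Athena access ID")
    parser.add_argument("--athena-key", help="AWS Athena secret key")
    return parser


def resolve_mode(args):
    """Return 'docket' or 'athena', rejecting mixed or incomplete input."""
    any_athena = args.athena_id is not None or args.athena_key is not None
    full_pair = args.athena_id is not None and args.athena_key is not None
    if args.docket is not None and any_athena:
        raise ValueError("pass either --docket or the Athena pair, not both")
    if args.docket is not None:
        return "docket"
    if full_pair:
        return "athena"
    raise ValueError("pass --docket, or both --athena-id and --athena-key")
```

Validating after parsing, rather than with a mutually exclusive group, makes it possible to require both halves of the ID-key pair together.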
Currently, running python download.py --docket DOCKETNO from the command line works until it hits the unresolved issue #63. In that case it terminates gracefully, creating two empty .csv files and reporting that the docket number could not be processed.
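To make the graceful-failure behavior concrete, here is a minimal sketch of the pattern, not the code in this PR. The names process_docket, write_rows, and fetch, and the output file names dockets.csv and court_summaries.csv, are all hypothetical; it also assumes "empty" means a zero-byte file rather than a header-only one.

```python
import csv
from pathlib import Path


def write_rows(path, rows):
    """Write a list of dicts to CSV; an empty list yields an empty file."""
    with open(path, "w", newline="") as f:
        if rows:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)


def process_docket(docket, fetch, out_dir):
    """Fetch and parse one docket, always writing both output files.

    `fetch` is a hypothetical callable returning (docket_rows, summary_rows).
    Returns True on success and False if the docket could not be processed.
    """
    out_dir = Path(out_dir)
    docket_rows, summary_rows = [], []
    ok = True
    try:
        docket_rows, summary_rows = fetch(docket)
    except Exception as exc:
        # Report the failure and fall through to write the empty files.
        ok = False
        print(f"Docket {docket} could not be processed: {exc}")
    write_rows(out_dir / "dockets.csv", docket_rows)
    write_rows(out_dir / "court_summaries.csv", summary_rows)
    return ok
```

Writing both files even on failure keeps any downstream step that expects them from breaking on a missing path.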
parse_court.py on its own still works fine: if you put some test PDF files in tmp\courts, it'll produce a .csv file in that directory with the parsed results.