In download.py, we can retrieve the court summary file and the docket file for a given docket number at the same time, so we could merge the two parsed dictionaries and return the result as the single entry for that docket. That would produce one .csv file in which the original docket info and the sex and race fields are already linked to the same docket number. I think this would streamline our analysis down the line, compared with keeping the information in two separate S3 buckets that we have to stitch together in the analysis phase.
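To make the suggestion concrete, here is a rough sketch of what the merge could look like. The function names (`merge_docket_entry`, `rows_to_csv`) and the field names in the usage example are hypothetical and not taken from download.py. The sketch assumes both parsers return flat dictionaries.

```python
import csv
import io


def merge_docket_entry(docket_number, docket_data, summary_data):
    """Combine the parsed docket and court summary dicts into one row.

    Hypothetical helper: assumes both parsers return flat dicts.
    """
    # A plain dict merge would silently overwrite a field that both
    # parsers produce with different values, so flag those instead.
    conflicts = sorted(
        key
        for key in docket_data.keys() & summary_data.keys()
        if docket_data[key] != summary_data[key]
    )
    if conflicts:
        raise ValueError(f"conflicting fields for {docket_number}: {conflicts}")

    merged = {"docket_number": docket_number}
    merged.update(docket_data)
    merged.update(summary_data)
    # Make sure the key we were asked for wins over any parsed value.
    merged["docket_number"] = docket_number
    return merged


def rows_to_csv(rows):
    """Serialize merged rows to CSV text, using the union of keys as columns."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    # restval fills in columns that a particular row does not have.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Usage would be something like `rows_to_csv([merge_docket_entry(number, docket, summary)])`, with the resulting text uploaded to a single bucket. Raising on conflicting fields is just one option; prefixing the keys by source (for example `summary_` and `docket_`) would also work if overlaps turn out to be common.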