data-liberation-project / aphis-inspection-reports

Inspection data and PDFs from the USDA's Animal and Plant Health Inspection Service.

Add species list to parsed and combined data #37

jsvine closed 1 year ago

jsvine commented 1 year ago

Adds two new keys to each inspection's parsed JSON, based on the "Species Inspected" page(s) at the end of each report:

- animals_total: int
- species: list[{ "count": int, "scientific": str, "common": str }]
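To make the shape of the two new keys concrete, here is a minimal sketch using Python's `TypedDict`. The class names `SpeciesEntry` and `SpeciesFields` are hypothetical and do not come from the repository, and the sample values are invented for illustration rather than taken from a real report.

```python
from typing import TypedDict


class SpeciesEntry(TypedDict):
    """One row from a report's "Species Inspected" page (hypothetical type name)."""
    count: int
    scientific: str
    common: str


class SpeciesFields(TypedDict):
    """The two keys added to each inspection's parsed JSON (hypothetical type name)."""
    animals_total: int
    species: list[SpeciesEntry]


# Invented sample values, for illustration only.
example: SpeciesFields = {
    "animals_total": 5,
    "species": [
        {"count": 3, "scientific": "Canis lupus familiaris", "common": "DOG ADULT"},
        {"count": 2, "scientific": "Felis catus", "common": "CAT ADULT"},
    ],
}
```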

The data/combined/inspections.csv file gains a pdf_animals_total field, and a new file is added: data/combined/inspections-species.csv, which contains those species-level counts and can be linked back to the main CSV via hash_id.
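As a rough sketch of linking the two files on hash_id, the snippet below uses only the standard library and small in-memory stand-ins for the CSVs. The helper name `link_species` is hypothetical, the species file's column names other than hash_id are assumed to mirror the JSON keys, and all row values are invented.

```python
import csv
import io
from collections import defaultdict

# In-memory stand-ins for data/combined/inspections.csv and
# data/combined/inspections-species.csv. Values are invented, and the
# species columns other than hash_id are an assumption.
inspections_csv = """hash_id,pdf_animals_total
abc123,5
def456,0
"""
species_csv = """hash_id,count,scientific,common
abc123,3,Canis lupus familiaris,DOG ADULT
abc123,2,Felis catus,CAT ADULT
"""


def link_species(inspections_file, species_file):
    """Return {hash_id: inspection row plus a 'species' list of matching rows}."""
    species_by_id = defaultdict(list)
    for row in csv.DictReader(species_file):
        species_by_id[row["hash_id"]].append(row)

    linked = {}
    for row in csv.DictReader(inspections_file):
        # .get() avoids creating empty entries for inspections with no species rows.
        linked[row["hash_id"]] = {**row, "species": species_by_id.get(row["hash_id"], [])}
    return linked


linked = link_species(io.StringIO(inspections_csv), io.StringIO(species_csv))
```

To run this against the real files, open them with `open(path, newline="")` and pass the file objects in place of the `io.StringIO` wrappers. Note that `csv.DictReader` yields every value as a string, so counts need an explicit `int(...)` before arithmetic.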

The species-level counts are validated against each inspection's total count; all inspections pass this test.
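The PR does not spell out how the validation is implemented. One plausible reading, sketched below, is that the per-species counts must sum to animals_total. The function name `species_counts_match` is hypothetical and the sample records are invented.

```python
def species_counts_match(parsed: dict) -> bool:
    """Return True when the per-species counts add up to animals_total.

    Assumes the check is a simple sum comparison; the repository's actual
    validation may differ.
    """
    return sum(s["count"] for s in parsed["species"]) == parsed["animals_total"]


# Invented records, for illustration only.
consistent = {
    "animals_total": 5,
    "species": [
        {"count": 3, "scientific": "Canis lupus familiaris", "common": "DOG ADULT"},
        {"count": 2, "scientific": "Felis catus", "common": "CAT ADULT"},
    ],
}
inconsistent = {**consistent, "animals_total": 6}

results = [species_counts_match(consistent), species_counts_match(inconsistent)]
```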