data-liberation-project / aphis-inspection-reports

Inspection data and PDFs from the USDA's Animal and Plant Health Inspection Service.

Incorrect report PDF links in APHIS results #23

Open jsvine opened 1 year ago

jsvine commented 1 year ago

While doing some QA, came across this inspection record:

... which links to https://aphis--c.na21.content.force.com/sfc/dist/version/download/?oid=00Dt0000000GyZH&ids=068t000000Yy5wD&d=%2Fa%2Ft0000001QdMH%2FhFNYsuUlrOyfMd48o4Xb08MJ7bMZ3X5fDoVvnEwVVt8&asPdf=false

... but that PDF is for a different inspection of a different organization entirely (one that we already have in our dataset and PDFs, albeit linked through a different URL):

Once we start parsing more info from the PDFs, we'll have a better way to catch such issues, but just flagging this for now, as something we'll want to keep an eye on and may need APHIS to fix.
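As a rough illustration of the kind of check that parsed PDF data would allow, here is a minimal sketch, not the project's actual code. It assumes each inspection record and each parsed PDF can be reduced to a certificate number and an inspection date; the `InspectionFields` class, the `find_mismatched_fields` helper, and all sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InspectionFields:
    """Hypothetical container for the fields being compared."""
    certificate_number: str
    inspection_date: str  # ISO-formatted date string, e.g. "2022-03-01"


def find_mismatched_fields(listed, parsed):
    """Return the names of fields where the PDF disagrees with the listing."""
    fields = ("certificate_number", "inspection_date")
    return [name for name in fields
            if getattr(listed, name) != getattr(parsed, name)]


# Made-up values: what the search result says vs. what the linked PDF says.
listed = InspectionFields("12-A-0001", "2022-03-01")
parsed = InspectionFields("34-B-0002", "2022-03-01")

mismatches = find_mismatched_fields(listed, parsed)
if mismatches:
    print("Possible wrong PDF; differing fields:", mismatches)
```

Any non-empty result would mark the record for manual review rather than prove that the link is wrong, since parsing errors could also produce a mismatch.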

jsvine commented 1 year ago

I've now made a more comprehensive attempt to identify these instances, thanks to the new file here. I've found a couple dozen where either the certificate number or the inspection date differs between the record and the linked PDF. Also, the problem does not seem to be randomly distributed; in many instances, the same certified entity has been assigned the wrong PDF multiple times:
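For illustration only, one way to check whether mismatches cluster on particular entities is to count flagged records per certificate number. This is a sketch under assumptions: the `repeat_offenders` helper and the certificate numbers below are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def repeat_offenders(mismatched_certificates, min_count=2):
    """Map each certificate number to its mismatch count, keeping repeats only."""
    counts = Counter(mismatched_certificates)
    return {cert: n for cert, n in counts.items() if n >= min_count}


# Made-up certificate numbers, one entry per record flagged as mismatched.
flagged = ["12-A-0001", "12-A-0001", "56-C-0003", "12-A-0001", "78-D-0004"]

repeats = repeat_offenders(flagged)
print(repeats)
```

If the wrong links were spread randomly across a large number of entities, few certificate numbers would be expected to appear more than once in such a count.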

jsvine commented 1 year ago

I've now asked APHIS about this and will update here if/when I get an answer.