ipno-llead / US-IPNO-exonerations

Processing repo for the Innocence Project New Orleans' Louisiana Law Enforcement Accountability Database

first draft of thoughts from deep-dive #19

baileyb0t closed 3 months ago

baileyb0t commented 4 months ago

I put this first draft in a separate notebook but can definitely add the cells over to the existing notebook if it looks good.

ayyubibrahimi commented 4 months ago

Yes! This is awesome.

There's one other characteristic that I'm interested in: are there any noticeable patterns in entities that aren't correctly identified?

I definitely sent you an outdated version of the tables where the false positives weren't correctly calculated. Sending an email now with updated tables.
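For reference, a minimal sketch of how the false-positive and unmatched counts could be derived from two lists of names. The function name, the normalization step, and the sample names are all hypothetical; the actual evaluation step in the pipeline may work differently. It assumes a "false positive" is an extracted name that is absent from the ground-truth table and an "unmatched" name is a ground-truth name the model did not extract.

```python
def evaluate_entities(extracted, ground_truth):
    """Compare extracted names with ground-truth names (hypothetical helper)."""
    # Assumption: matching is exact after lowercasing and collapsing whitespace.
    def norm(name):
        return " ".join(name.lower().split())

    found = {norm(n) for n in extracted}
    truth = {norm(n) for n in ground_truth}
    return {
        "true_positives": sorted(found & truth),   # extracted and in ground truth
        "false_positives": sorted(found - truth),  # extracted but not in ground truth
        "unmatched": sorted(truth - found),        # in ground truth but never extracted
    }


# Made-up sample input, for illustration only.
result = evaluate_entities(
    extracted=["James Ducos", "Example Name"],
    ground_truth=["james  ducos", "Woodall"],
)
```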

baileyb0t commented 4 months ago

I still need to add a review of the unmatched names, but I corrected the CSV files and the false-positive comments in the notebook. I also added a reference to one of the prompts used in the model script to tie in auditing, but I'll leave it to you to tie in the processing repo!

baileyb0t commented 4 months ago

I couldn't find "James Lopuis" or "Woodall" (top unmatched names) in the Exhibit doc. @ayyubibrahimi Do you have an idea of where these names appear in the PDF? I found a few of the matches so it does appear to be the correct file.

I found the top unmatched names in the other PDF and included snippets and a couple comments about those.

ayyubibrahimi commented 4 months ago

"Woodall" is on pg22.

"James Lopuis" is supposed to be "James Dupuis". I'll fix that typo in the groundtruth table and re-run the pipeline.
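A typo like this in the ground-truth table could probably be caught automatically by checking each unmatched name against the names the model did extract. The sketch below is one possible approach using the standard library's `difflib`; the function name and the 0.8 cutoff are assumptions, not part of the existing pipeline.

```python
from difflib import get_close_matches


def likely_typos(unmatched, extracted, cutoff=0.8):
    """Map each unmatched ground-truth name to its closest extracted name, if any.

    Hypothetical helper: `cutoff` is an arbitrary similarity threshold
    (0 to 1) and would need tuning on real data.
    """
    pool = [name.lower() for name in extracted]
    suggestions = {}
    for name in unmatched:
        close = get_close_matches(name.lower(), pool, n=1, cutoff=cutoff)
        if close:
            suggestions[name] = close[0]
    return suggestions


# Names taken from this thread; "Woodall" has no near match among the extracted names.
suggestions = likely_typos(
    unmatched=["James Lopuis", "Woodall"],
    extracted=["James Dupuis", "James Ducos"],
)
```

Any suggestion would still need a manual check against the PDF, since two similar names can both be legitimate.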

baileyb0t commented 4 months ago

Is "james dneps" supposed to be in the exhibit doc? Also, "James Dupuis" and "James Ducos" are both described as photographers. Is that correct? I just want to make sure that isn't a typo in the original document before I mention these two.

ayyubibrahimi commented 4 months ago

"James Dupuis" and "James Ducos" are both crime lab photographers. "Dneps" is a typo and shouldn't be in the gt table. I've just sent the new output of the evaluation step, where "Dneps" has been removed.

baileyb0t commented 3 months ago

There is a merge conflict for the overview notebook but I did a pull before writing in it, so there shouldn't be any meaningful changes lost!