ipno-llead / US-IPNO-exonerations

Processing repo for the Innocence Project New Orleans' Louisiana Law Enforcement Accountability Database

initial code to import and process handlabeled examples and compare t… #10

Open tarakc02 opened 1 year ago

tarakc02 commented 1 year ago

…o model outputs, calcs precision and recall by parameter settings
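A minimal sketch of the precision/recall comparison described in the title, assuming extracted and hand-labeled names can be treated as sets per document; the function and sample names below are hypothetical illustrations, not the repo's actual schema.

```python
# Sketch: compare model-extracted names against hand-labeled names and
# compute per-document precision and recall. All names below are
# hypothetical placeholders.

def precision_recall(extracted, labeled):
    """Precision and recall for one document's extracted name set."""
    extracted, labeled = set(extracted), set(labeled)
    tp = len(extracted & labeled)  # names both extracted and hand-labeled
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(labeled) if labeled else 0.0
    return precision, recall

# Example: hypothetical model output vs. hand labels for one document
model_names = ["Det. Smith", "Sgt. Jones", "Lt. Brown"]
hand_labels = ["Det. Smith", "Sgt. Jones", "Officer Davis"]

p, r = precision_recall(model_names, hand_labels)
# 2 true positives out of 3 extracted -> precision 2/3
# 2 true positives out of 3 labeled   -> recall 2/3
```

In practice these per-document scores would then be grouped by parameter setting (k, chunk size, HYDE on/off) and averaged.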

tarakc02 commented 1 year ago

notes:

ayyubibrahimi commented 1 year ago

@tarakc02 Sorry! Just turned off the branch protection rule that required a review before merging.

I've produced some data for the second piece of analysis. The parameters for the tables are:

I chose three queries:

1. Identify individuals, by name, with the specific titles of officers, sergeants, lieutenants, captains, detectives, homicide officers, and crime lab personnel in the transcript. Specifically, provide the context of their mention related to key events in the case, if available.
2. List individuals, by name, directly titled as officers, sergeants, lieutenants, captains, detectives, homicide units, and crime lab personnel mentioned in the transcript. Provide the context of their mention in terms of any significant decisions they made or actions they took.
3. Locate individuals, by name, directly referred to as officers, sergeants, lieutenants, captains, detectives, homicide units, and crime lab personnel in the transcript. Explain the context of their mention in relation to their interactions with other individuals in the case.

And for each document, I ran the query 6 consecutive times, both with HYDE and without HYDE. Because I figured we should be able to infer the effect from a subset of documents, I produced these data for a smaller subset (3 police reports and 3 transcripts). Let me know if I should point you to them or if you want me to do the analysis.
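The run matrix described above can be sketched as nested loops over documents, queries, the HYDE toggle, and six consecutive trials. This is an illustration under stated assumptions; `run_query` and the document/query names are hypothetical placeholders for the repo's actual retrieval and extraction code.

```python
# Sketch of the experiment matrix: each query run 6 consecutive times,
# with and without HYDE, over each document in the subset.

documents = ["report_1", "report_2", "report_3",
             "transcript_1", "transcript_2", "transcript_3"]
queries = ["query_1", "query_2", "query_3"]

def run_query(doc, query, hyde):
    """Hypothetical placeholder for the retrieval + name-extraction call."""
    return {"doc": doc, "query": query, "hyde": hyde}

results = []
for doc in documents:
    for query in queries:
        for hyde in (True, False):
            for trial in range(6):  # 6 consecutive runs per setting
                out = run_query(doc, query, hyde)
                out["trial"] = trial
                results.append(out)

# 6 documents x 3 queries x 2 HYDE settings x 6 trials = 216 runs
```

Repeating runs like this is what lets run-to-run variability in the extracted names be measured separately from the effect of the parameters.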

ayyubibrahimi commented 1 year ago

I've uploaded data for the remaining three queries. All other parameters from above remain the same. Data can be found here.

ayyubibrahimi commented 1 year ago

I've re-run an analysis similar to the one described above with GPT-4. I chose parameters different from, but similar to, those described above because GPT-4 has a token limit 2x greater than GPT-3's. Based on the initial analysis that you did, these are presumably optimal for name extraction. The intent behind running this analysis with GPT-4 is still to determine why we're seeing such variability in which names are extracted.

The parameters for the tables are:

```
model = GPT-4
k = 15
chunk_size = 1000
chunk_overlap = 500
hyde = 1 or 0
```
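To make the `chunk_size` / `chunk_overlap` parameters concrete, here is a minimal sliding-window chunker: each chunk advances by `chunk_size - chunk_overlap` characters, so consecutive chunks share half their text at the settings above. This is an illustration of how the two parameters interact, not the repo's actual chunking code.

```python
# Sketch: sliding-window chunking with stride = chunk_size - chunk_overlap.

def chunk_text(text, chunk_size=1000, chunk_overlap=500):
    stride = chunk_size - chunk_overlap  # 500 new characters per chunk
    chunks = []
    for start in range(0, len(text), stride):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):  # last window reached the end
            break
    return chunks

chunks = chunk_text("x" * 2500)
# windows start at 0, 500, 1000, 1500 -> 4 chunks, each overlapping
# the previous one by 500 characters
```

A large overlap like this reduces the chance that a name is split across a chunk boundary, at the cost of roughly doubling the number of chunks retrieved per document.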

I chose two queries:

And for each document, I ran the query 6 consecutive times both with HYDE and without HYDE on 3 police reports and 3 transcripts. They can be found here.