sc3 / cook-convictions

The Tarbell project that generates Convicted in Cook
http://convictions.smartchicagoapps.org/

How do we identify distinct charges in the data? #41

Closed ghing closed 10 years ago

ghing commented 10 years ago

Rows in the data don't represent convictions. They represent dispositions in cases that result in convictions. This page provides the FBI definition of disposition as:

The FBI defines a disposition as an action regarded by the criminal justice system to be final. A disposition states that arrest charge(s) have been modified, dropped or reports the findings of a court.

This definition seems like a good one to describe our data. However, note that our data set only includes dispositions related to convictions. That is, it doesn't contain records for dispositions reflecting a charge being dropped.

A description of the fields in the data can be found here

We want to do analysis that lets us talk in terms of numbers of convictions. Furthermore, according to @tjakester, if someone is convicted of murdering two people, we want to record that as two separate convictions.

This is somewhat difficult because:

An example of a slice of data that seems to include multiple charges and multiple dispositions per charge can be found here.

My initial inclination is, for each case number, to group records by charge disposition date and count records for the earliest charge disposition date.

However, in this slice of the data, there is one record for the earliest charge disposition date, and then two records for each charge disposition date thereafter. There are 5 different charge disposition dates in all for this case.
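To make the grouping idea concrete, here is a minimal sketch of counting records on the earliest charge disposition date for each case. The field names `case_number` and `chrgdispdate` are placeholders I made up for illustration and may not match the real columns, the rows are invented, and dates are assumed to be ISO-formatted strings so they compare correctly.

```python
from collections import defaultdict


def count_earliest_dispositions(rows):
    """Map each case number to the number of rows on its earliest disposition date."""
    by_case = defaultdict(list)
    for row in rows:
        by_case[row["case_number"]].append(row)

    counts = {}
    for case_number, case_rows in by_case.items():
        # ISO date strings sort chronologically, so min() finds the earliest date.
        earliest = min(r["chrgdispdate"] for r in case_rows)
        counts[case_number] = sum(
            1 for r in case_rows if r["chrgdispdate"] == earliest
        )
    return counts


# Invented rows: case "A" has one record on its earliest date and two on a
# later date; case "B" has two records on its earliest (and only) date.
sample_rows = [
    {"case_number": "A", "chrgdispdate": "2005-01-10"},
    {"case_number": "A", "chrgdispdate": "2005-06-01"},
    {"case_number": "A", "chrgdispdate": "2005-06-01"},
    {"case_number": "B", "chrgdispdate": "2007-03-15"},
    {"case_number": "B", "chrgdispdate": "2007-03-15"},
]

earliest_counts = count_earliest_dispositions(sample_rows)
```

For a case shaped like the slice above, this approach would report one charge even though later dates carry two records each, which is exactly the ambiguity in question.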

Furthermore, according to @tjakester, there shouldn't be more than one count of a drug charge. However, in this slice, there appear to be two records for the statute for every disposition date. Are these two separate charges that we should count separately in our analysis, or does this reflect some other situation?

How do we identify distinct charges in the data?

If we can come up with criteria that define distinct charges, what is the best practice for testing that the assumptions in those criteria hold?

If we can't identify distinct charges in the data, we can count distinct statutes for each case. That lets us count the total number of times a person was convicted of murder, but if a person murdered two people, that would only get counted once. How would we write about our numbers in a way that makes this clear? Would not being able to count distinct counts of something like murder make any analysis too misleading or misrepresentative?
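As a rough illustration of the fallback, this sketch counts each statute at most once per case. The field names `case_number` and `statute` and the statute strings are placeholders invented for the example, not values from the real data.

```python
from collections import Counter


def convictions_by_statute(rows):
    """Count cases per statute, counting each statute at most once per case."""
    # A set of (case, statute) pairs collapses repeated rows for the same
    # statute within one case into a single entry.
    pairs = {(row["case_number"], row["statute"].strip()) for row in rows}
    return Counter(statute for _case, statute in pairs)


# Invented rows: case "A" has two murder rows (possibly two victims),
# case "B" has one murder row and one drug row.
sample_rows = [
    {"case_number": "A", "statute": "MURDER-STATUTE"},
    {"case_number": "A", "statute": "MURDER-STATUTE"},
    {"case_number": "B", "statute": "MURDER-STATUTE"},
    {"case_number": "B", "statute": "DRUG-STATUTE"},
]

statute_counts = convictions_by_statute(sample_rows)
```

Here the two murder rows in case "A" collapse into one, so the murder total is 2 rather than 3. That is the undercount we would need to explain in any writing based on these numbers.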

@tjakester, please leave any comments if I'm not correctly describing the assumptions we're making about the data, or describing the different scenarios.

Yana715 commented 10 years ago

I met with Angela this morning and had her take a look at the data. She said that she couldn't really answer any of the questions about how the data was compiled or how to understand the different charges for one individual.

She said that when she needs to understand the data, she talks to one of the programmers who put it together, either from Rob Boyko's office or Preckwinkle's office. She gets in touch with them and sends a sample of the data along with her questions.

ghing commented 10 years ago

It seems like there are separate sentencing and credit for time served records for the same count.

That is, for the same count, rather than combining multiple charge dispositions into one field, there are multiple rows. For example one row with a disposition of "DEF SENTENCED TO COOK CNTY DOC" and one row with a disposition of "CREDIT DEFENDANT FOR TIME SERV".

tjakester commented 10 years ago

I am pretty sure those are related to probation violations. I am checking on this......

Tracy Siska | Executive Director, Chicago Justice Project

On Jul 18, 2014, at 10:12 AM, Geoffrey Hing notifications@github.com wrote:

It seems like there are separate sentencing and credit for time served records for the same count.


ghing commented 10 years ago

I wanted to clarify my previous comment where I wrote:

It seems like there are separate sentencing and credit for time served records for the same count.

Looking at the earlier slices I made to illustrate ambiguity, it seemed like it was possible that there were multiple disposition records for the same charge on the same date and that we might be able to only count rows with certain types of dispositions (e.g. "DEF SENTENCED TO COOK CNTY DOC") to get a number of distinct counts for each case.

I wanted to check whether dispositions of "CREDIT DEFENDANT FOR TIME SERV" always happened in concert with dispositions that indicated sentencing to do time in some kind of facility. That is, if there were two "DEF SENTENCED TO * DOC" dispositions, there would be two "CREDIT DEFENDANT FOR TIME SERV" dispositions.

However, I found this case where there are two "DEF SENTENCED ILLINOIS DOC" records on the same date, for the same statute, but only one "CREDIT DEFENDANT FOR TIME SERV". The "DEF SENTENCED ILLINOIS DOC" records have different minsent values. Do these represent multiple counts of murder? If so, why would the minsent values be different, and why is there only one "CREDIT" disposition?
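A check like the one described above could be sketched as follows. It flags every case, statute, and date group where the number of sentencing rows differs from the number of credit rows. The field names `case_number`, `statute`, `chrgdispdate`, and `chrgdisp` are assumptions, as is the regular expression, which is my guess at a pattern covering both "DEF SENTENCED TO COOK CNTY DOC" and "DEF SENTENCED ILLINOIS DOC". The rows are invented.

```python
import re
from collections import Counter

# Assumed pattern: any disposition starting with "DEF SENTENCED" and ending in "DOC".
SENTENCED_DOC = re.compile(r"^DEF SENTENCED\b.*\bDOC$")
CREDIT = "CREDIT DEFENDANT FOR TIME SERV"


def find_unpaired_groups(rows):
    """Return groups whose sentencing and credit row counts differ.

    Keys are (case_number, statute, chrgdispdate) tuples; values are
    (sentenced_count, credit_count) tuples.
    """
    sentenced = Counter()
    credited = Counter()
    for row in rows:
        key = (row["case_number"], row["statute"], row["chrgdispdate"])
        disposition = row["chrgdisp"].strip()
        if SENTENCED_DOC.match(disposition):
            sentenced[key] += 1
        elif disposition == CREDIT:
            credited[key] += 1

    # Counter returns 0 for missing keys, so groups with only one kind of
    # row are flagged as well.
    return {
        key: (sentenced[key], credited[key])
        for key in sentenced.keys() | credited.keys()
        if sentenced[key] != credited[key]
    }


def make_row(case_number, disposition):
    """Build an invented row; statute and date are fixed placeholders."""
    return {
        "case_number": case_number,
        "statute": "MURDER-STATUTE",
        "chrgdispdate": "2006-03-01",
        "chrgdisp": disposition,
    }


sample_rows = [
    make_row("C", "DEF SENTENCED ILLINOIS DOC"),
    make_row("C", "DEF SENTENCED ILLINOIS DOC"),
    make_row("C", CREDIT),
    make_row("D", "DEF SENTENCED TO COOK CNTY DOC"),
    make_row("D", CREDIT),
]

unpaired = find_unpaired_groups(sample_rows)
```

Running something like this over the full data set would show how common the mismatched pattern is, which might help decide whether it is a data-entry quirk or a meaningful distinction.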

ghing commented 10 years ago

We figured out a strategy for this:

This was implemented in 7d2cbb39b73c57988a333eef2de26f385b5edaa5 and I've run the management command to create the Conviction records. Note that we'll have to re-run it when we clean more of the disposition records.