Open emily878 opened 9 years ago
Hey, @beccasjames - could you help us out with a list of minimal necessary data elements?
Core elements for a particular state incarceration dataset, inmate population data, would include the following elements:
A model (or "token") dataset can be found at the California Department of Corrections and Rehabilitation (CDCR). They produce weekly and monthly population reports for both inmate and parole populations, including an extensive archive: http://www.cdcr.ca.gov/Reports_Research/Offender_Information_Services_Branch/Population_Reports.html
Further, an ideal inmate population dataset would:
So far, I have not identified a state that fulfills all of these requirements. If I find one, I will post an update.
Is it desirable, or even possible, to have identifiable, per-prisoner granularity?
Becca and I talked about that and I personally don't think we want that as our first cut at a dataset. It will increase the visibility of people's PII in a way that I think will be problematic for the project.
On Fri, Mar 20, 2015 at 4:16 PM, Waldo Jaquith notifications@github.com wrote:
Is it desirable, or even possible, to have identifiable, per-prisoner granularity?
Emily Shaw, National Policy Manager, Sunlight Foundation
Echoing Emily here, the PII shared with inmate-level micro-data is potentially problematic. A few states actually do produce extensive, machine-readable datasets with inmate-level micro-data. If you're interested in what those look like, see examples below:
Got it—thank you!
That Nebraska data is the weirdest thing. It's an Excel spreadsheet with two worksheets (one with 60,000 records, one with a suspicion-inducing 65,535) that each contain just one column, with one number in each row. I feel a bit like I just bought a hard drive at Best Buy, got it home, opened the box, and found only a brick inside.
Define the essential substantive elements of the core State Incarceration dataset. What are the components that it must minimally include? Do we have a dataset that we could hold up as a model?