chicago-police-violence / data

Dataset about the personnel, use of force, and complaints in the Chicago Police Department
MIT License

The "final result" #3

Closed trevorcampbell closed 3 years ago

trevorcampbell commented 3 years ago

It is a bit odd that we have both roster.csv and profiles.csv in the output. Although I'm not sure exactly what the difference is, it seems the two P0-46957_*.csv files also overlap.

I think we should be more "opinionated" about what we consider to be "final results" of the processing. Maybe we need to introduce a third folder to store the final results?

Thibauth commented 3 years ago

I completely understand the confusion. This is due to me being very sloppy with filenames in the later parts I have coded. I was planning to clean up this code and clarify all this; hopefully it will make sense when I am done. I think #4 is also closely related.

Thibauth commented 3 years ago

See https://github.com/chicago-police-violence/data/issues/4#issuecomment-898786989 . I think once we clarify the filenames, the linked folder can serve as the final folder, and it could even be renamed to final. It is not clear to me that we need to introduce another folder for this, since the linking is currently our final step.

trevorcampbell commented 3 years ago

After working on this repo a bit, I understand now why you have separate roster.csv and profiles.csv files (an officer shows up in different records with different field values, e.g. if their name or rank changes over time). I wonder if it makes sense just to output the profiles.csv file... we can chat about this in person and then record the outcome of the discussion here for future reference.
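
To make the distinction concrete, here is a minimal sketch of how a profiles-style table (one row per appearance of an officer in a source record) could collapse into a roster-style table (one row per officer). Everything in it is an assumption for illustration: the column names (officer_id, last_name, rank, source), the sample rows, and the "keep the last-seen values" rule are hypothetical and are not taken from the actual roster.csv or profiles.csv.

```python
# Hypothetical profile rows: one row per (officer, source record) appearance.
# Column names and values are made up for illustration only.
profiles = [
    {"officer_id": 1, "last_name": "Smith", "rank": "Police Officer", "source": "file_a"},
    {"officer_id": 1, "last_name": "Smith-Jones", "rank": "Sergeant", "source": "file_b"},
    {"officer_id": 2, "last_name": "Lee", "rank": "Detective", "source": "file_a"},
]


def build_roster(profile_rows):
    """Collapse profiles to one row per officer.

    Assumed rule: later rows overwrite earlier ones, so each officer
    keeps the field values from their last appearance.
    """
    roster = {}
    for row in profile_rows:
        # Drop the per-record "source" column; it has no meaning per officer.
        roster[row["officer_id"]] = {k: v for k, v in row.items() if k != "source"}
    return list(roster.values())


def profiles_per_officer(profile_rows):
    """Count how many distinct profile rows each officer has."""
    counts = {}
    for row in profile_rows:
        counts[row["officer_id"]] = counts.get(row["officer_id"], 0) + 1
    return counts


roster = build_roster(profiles)
counts = profiles_per_officer(profiles)
```

In this sketch the roster loses the fact that officer 1 appeared under two names and two ranks, which is the information a profiles-style file would retain.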

Thibauth commented 3 years ago

I think we finally have a reasonable-looking "final" folder. I think we can output both roster.csv and profiles.csv as long as the documentation is very clear about what is contained in profiles.csv (and emphasizes that in most cases, roster.csv is all you need). What do you think? Feel free to close this issue, unless you believe there is more to be done about it.

Thibauth commented 3 years ago

Closing this for now. The final folder looks fine to me now, but we can open a new issue for the more specific question of roster.csv vs profiles.csv.