Open colinxfleming opened 6 years ago
Should include an example row:
Lou-000444,20100809,ODONNELL,APRIL,White,21,Female,Louisville,KY,38.1841416,-85.605567,Closed by arrest
Fixed in my pull request #6. I couldn't get pandas to read the data, so I found several invalid characters and deleted them. Pandas reads the file fine now.
Pandas reports the line where it has a problem reading the data. I used Notepad++, turned on the option that shows special characters, and changed them. I remember it being a problem with only a few rows in the data. I checked the changes into my clone of the data here: https://github.com/y2kbowen/data-homicides. You can see the lines that had problems and the changes I made here: https://github.com/y2kbowen/data-homicides/commit/99294e0db933fc1b2914549420654d2827df9ccd
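For anyone who would rather not hunt for the bad rows by hand in an editor, here is a minimal sketch of how the offending lines could be located with plain Python. It is not what was actually done in the commit above; the function name `find_non_utf8_lines` is made up for illustration, and it assumes the problem is simply bytes that are not valid UTF-8.

```python
def find_non_utf8_lines(path):
    """Return (line_number, raw_bytes) for every line that is not valid UTF-8.

    Hypothetical helper for illustration. Line numbers are 1-based, so they
    should line up with what a text editor shows.
    """
    bad = []
    # Read in binary mode so Python does not try to decode anything for us.
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError:
                bad.append((lineno, raw))
    return bad

# Example usage (assumes the CSV is in the current directory):
# for lineno, raw in find_non_utf8_lines("homicides-data.csv"):
#     print(lineno, raw)
```

Printing the raw bytes makes it easy to see which characters are involved before deciding whether to delete or replace them.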
I hope this helps
KB
On Mon, Apr 27, 2020 at 6:24 PM msmith2024 notifications@github.com wrote:
I am unable to read the data in pandas. How did you correct the problem?
Hey folks! I was loading this into Postgres to poke at it and ran into some errors: my COPY statement choked on lines with non-UTF-8 characters in them, such as L31119 in homicides-data.csv. I was able to work around it no problem, but figured I'd pay it forward and ask whether it would be helpful to convert these characters to something UTF-8 friendly. Feel free to close this issue if you all would rather not; if that would be helpful, please let me know and I'll spin up a PR for it.
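To make the proposal concrete, here is a rough sketch of what such a conversion could look like. It is only one possible approach, not necessarily what a PR would do: the function name `reencode_to_utf8` is hypothetical, and the choice of Latin-1 as the fallback encoding is an assumption, since the thread does not say what encoding the stray characters are actually in. Latin-1 maps every byte to some character, so the conversion never fails, but the result should be eyeballed to confirm the characters look right.

```python
def reencode_to_utf8(src, dst, fallback="latin-1"):
    """Copy src to dst, re-encoding only the lines that are not valid UTF-8.

    Hypothetical helper for illustration. `fallback` is an ASSUMED source
    encoding for the bad lines; the real one is not stated in this thread.
    Returns the number of lines that were re-encoded.
    """
    changed = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for raw in fin:
            try:
                raw.decode("utf-8")  # already valid: copy through untouched
            except UnicodeDecodeError:
                # Reinterpret the bytes with the fallback encoding,
                # then write them back out as UTF-8.
                raw = raw.decode(fallback).encode("utf-8")
                changed += 1
            fout.write(raw)
    return changed

# Example usage (output file name is made up):
# n = reencode_to_utf8("homicides-data.csv", "homicides-data-utf8.csv")
# print(n, "lines re-encoded")
```

Working line by line leaves rows that are already valid UTF-8 byte-for-byte identical, which keeps the diff small. Readers who only need to load the file, rather than fix it, may be able to get by with passing an `encoding` argument to `pandas.read_csv` instead.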
Thanks again for making this data public!