openva / crump

A parser for the Virginia State Corporation Commission's business registration records.
https://vabusinesses.org/
MIT License

Many source files are improperly encoded #116

Open waldoj opened 8 years ago

waldoj commented 8 years ago

Corp.csv, LLC.csv, Name.History.csv, and Officer.csv all claim to be UTF-8, but contain invalid characters. (They're all people's names, and I guarantee you that well over 90% of those people are black or Latino. So.) They can be found via grep -axv '.*' filename.csv.
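
The grep check above depends on the shell running in a UTF-8 locale. For a locale-independent version of the same check, here is a minimal Python sketch that reads each file as raw bytes and reports the lines that fail to decode as UTF-8. The function names (find_invalid_utf8_lines, scan_file) and the sample rows are hypothetical, made up for illustration; none of this is part of crump.

```python
from typing import Iterable, Iterator, List, Tuple


def find_invalid_utf8_lines(lines: Iterable[bytes]) -> Iterator[Tuple[int, bytes]]:
    """Yield (line_number, raw_bytes) for each line that is not valid UTF-8."""
    for line_number, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            # Strip the line ending so the offending bytes are easier to read.
            yield line_number, raw.rstrip(b"\r\n")


def scan_file(path: str) -> List[Tuple[int, bytes]]:
    """Return every invalid line in the file at `path` (hypothetical helper)."""
    # Binary mode: Python must not try to decode the file for us.
    with open(path, "rb") as handle:
        return list(find_invalid_utf8_lines(handle))


if __name__ == "__main__":
    # Made-up sample rows: the second is Latin-1 encoded, the third is UTF-8.
    sample = [
        b"ACME LLC\n",
        b"Jos\xe9 Garc\xeda\n",
        "Jos\u00e9 Garc\u00eda\n".encode("utf-8"),
    ]
    for number, raw in find_invalid_utf8_lines(sample):
        print(number, raw)
```

Calling scan_file("Officer.csv") would list the line numbers and raw bytes to look at. Keeping the bytes undecoded is deliberate: the raw values are what you need to guess the real source encoding (Latin-1, Windows-1252, or something else) before re-encoding anything.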