campus-crime-watch / campus-crime-watch.github.io

A journalism project: an interactive map of crime data for Stanford University and information about the Clery Act.
https://campus-crime-watch.github.io/
MIT License

finish crime_category.py to standardize crime names #23

Closed: ozterz closed this issue 1 year ago

ozterz commented 1 year ago

Proposed solution:

Create general categories of crimes and map the dirty values to those standardized values. Use a regex to fuzzy-match each data row's raw crime name to a standardized category.
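A rough sketch of what that mapping could look like, not the actual contents of crime_category.py. The category names, the patterns, and the `categorize` helper are all hypothetical and would need to be tuned against the real data:

```python
import re

# Hypothetical categories and patterns, for illustration only.
# Order matters: more specific categories are checked first.
CATEGORY_PATTERNS = [
    ("Burglary", re.compile(r"\bburgl", re.IGNORECASE)),
    ("Motor Vehicle Theft", re.compile(r"\b(vehicle|auto)\s+theft\b", re.IGNORECASE)),
    ("Theft", re.compile(r"\b(theft|larceny)\b", re.IGNORECASE)),
    ("Vandalism", re.compile(r"\bvandal", re.IGNORECASE)),
]


def categorize(raw_value):
    """Return the first standardized category whose pattern matches."""
    for category, pattern in CATEGORY_PATTERNS:
        if pattern.search(raw_value):
            return category
    # Anything unmatched falls into a catch-all bucket for manual review.
    return "Other"


for dirty in ["BURGLARY - RESIDENTIAL", "Petty theft: bicycle", "AUTO THEFT"]:
    print(dirty, "->", categorize(dirty))
```

Keeping the patterns in an ordered list means a new dirty spelling only needs one more pattern, and the "Other" bucket shows which rows still need a rule.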

zstumgoren commented 1 year ago

Don't forget the famous saying:

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

Regex is powerful, but can get complex VERY fast. There are coding strategies to combine "procedural" code with simpler regexes so you can keep your sanity. Let's talk.
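One possible way to mix procedural code with simpler regexes, as a sketch only: clean the string first with two small regexes, then decide the category with plain Python checks. The `normalize` and `standardize` helpers and the category names are made up for illustration and are not from the project:

```python
import re


def normalize(raw_value):
    """Lowercase, turn punctuation into spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", raw_value.lower())
    return re.sub(r"\s+", " ", cleaned).strip()


def standardize(raw_value):
    """Map a dirty crime name to a hypothetical standardized category."""
    text = normalize(raw_value)
    words = set(text.split())

    # Plain membership tests handle the easy cases without any regex.
    if "burglary" in words:
        return "Burglary"
    if "theft" in words or "larceny" in words:
        # A second procedural check splits vehicle thefts from other thefts.
        if words & {"vehicle", "auto", "car"}:
            return "Motor Vehicle Theft"
        return "Theft"
    # A small regex is still handy for prefixes (vandal, vandalism, vandalized).
    if re.search(r"\bvandal", text):
        return "Vandalism"
    return "Other"


print(standardize("AUTO-THEFT"))          # Motor Vehicle Theft
print(standardize("Vandalism/Graffiti"))  # Vandalism
```

Because the cleanup happens up front, each regex stays short, and the branching logic reads as ordinary `if` statements rather than one large pattern.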