codeforamerica / classifyr

A tool for aggregating and crowd-sourcing the classification of emergency call data
MIT License

Introduce a new table specifically for storing call types. #60

Open T-Dnzt opened 2 years ago

T-Dnzt commented 2 years ago

When working on the recent classifications MVP stories, I’ve reused the original implementation and tried not to introduce too many new data representations to keep things simple:

-> A data set has many fields and each field has many unique values.
-> The field we care the most about in a data set is the one mapped to Detailed Call Type because its unique values are the ones we want to match with common incident types. So each classification links a unique value with a common incident type.
-> To add the classification logic (max number of classifications, approval, etc.), I have extended the UniqueValue model since it made sense at the time and it works fine for the MVP.

But after giving this more thought, I've realized that for most unique values, all those recently added columns (classifications_count, approved_at, etc.) will never be used and will remain nil (since their associated field wasn't mapped to Detailed Call Type).

So I've started to wonder if we should create a new table to "copy" the unique values (where unique_value.field == "Detailed Call Type"), which could simply be named call_types. We could have a unique_value_id in that call_types table to easily reference the original data and the classification process would only deal with that new model / table.

This would offer a clear separation between how data sets are stored (data sets -> fields -> unique_values) and how data are classified (call_types -> classifications).
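To make the proposed split more concrete, here is a rough plain-Ruby sketch (no Rails, no database) of how call types could be derived from unique values. It is only an illustration under assumptions: the `mapped_to` attribute, the `build_call_types` helper, the struct layouts, and the sample rows are all hypothetical and not taken from the existing code base; only `unique_value_id`, `classifications_count`, and `approved_at` come from the description above.

```ruby
# Hypothetical stand-ins for the real models; attribute names are assumptions.
Field       = Struct.new(:id, :name, :mapped_to)
UniqueValue = Struct.new(:id, :field, :value)
# Classification-related columns live only on CallType in this sketch.
CallType    = Struct.new(:id, :unique_value_id, :value,
                         :classifications_count, :approved_at)

DETAILED_CALL_TYPE = "Detailed Call Type"

# Copy only the unique values whose field is mapped to Detailed Call Type,
# keeping a unique_value_id that points back at the original record.
def build_call_types(unique_values)
  unique_values
    .select { |uv| uv.field.mapped_to == DETAILED_CALL_TYPE }
    .each_with_index
    .map { |uv, i| CallType.new(i + 1, uv.id, uv.value, 0, nil) }
end

# Made-up sample data, for illustration only.
call_type_field = Field.new(1, "call_desc", DETAILED_CALL_TYPE)
priority_field  = Field.new(2, "priority", "Priority")

unique_values = [
  UniqueValue.new(10, call_type_field, "WELFARE CHECK"),
  UniqueValue.new(11, call_type_field, "TRAFFIC STOP"),
  UniqueValue.new(12, priority_field, "2")
]

call_types = build_call_types(unique_values)
```

In this sketch the unique value belonging to the priority field never gets a call type, so the classification columns exist only on records that can actually be classified.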

Let me know if you have any questions or comments.