Closed by terryf82 5 years ago
Thanks for creating the issue @terryf82 !
So what I'm looking for here is an explanation of the features in the model and the possible values they can have (which is especially important for categorical variables, since their encodings may not be clear). Specifically, I'd like a table with the following:
(The reason I'm asking for this is that (1) I'd like to understand what our model takes into consideration when generating its predictions, and (2) I need to know how to translate these feature names into something more human-friendly for the interpretability part of the viz.)
Link to google doc: https://docs.google.com/document/d/1PwA07OfSD5ELy0pPTb4ieDDKO2syrBojA_J_JRLkhk4/edit
Currently running the model locally to make sure I cover everything.
I'm wondering whether our schemas would be the appropriate place to store this type of information. Most of our data inputs/outputs already have a schema (crashes, concerns, predictions), or one is in the works or planned (point-based features, segments, etc.).
The predictions schema @ https://github.com/Data4Democracy/crash-model/blob/data_standards/standards/predictions-schema.json is still only in draft form and will need updating based on the work we're doing at the moment, but as an example it provides most of the features @alicefeng has mentioned as desirable:
as well as a structured way to specify data type and enumerate allowed values where applicable.
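To make that concrete, here is a rough sketch of how a schema could double as the feature table. It is not the actual contents of predictions-schema.json: the property names `speed_limit` and `road_class`, their values, and the `feature_table` helper are all made up for illustration. Only the `type`, `title`, `description`, and `enum` keywords are standard JSON Schema.

```python
import json

# Hypothetical schema fragment. The feature names and allowed values below
# are illustrative assumptions, not taken from predictions-schema.json.
SCHEMA = json.loads("""
{
  "properties": {
    "speed_limit": {
      "type": "integer",
      "title": "Posted speed limit",
      "description": "Posted speed limit on the segment"
    },
    "road_class": {
      "type": "string",
      "title": "Road class",
      "enum": ["local", "collector", "arterial"]
    }
  }
}
""")


def feature_table(schema):
    """Return one row per feature: (name, human-friendly label, type, allowed values)."""
    rows = []
    for name, spec in sorted(schema["properties"].items()):
        # Fall back to the raw feature name when no title is provided.
        label = spec.get("title", name)
        rows.append((name, label, spec.get("type"), spec.get("enum")))
    return rows


for row in feature_table(SCHEMA):
    print(row)
```

Something along these lines would let the viz pull its human-friendly labels and allowed values straight from the schema rather than from a separate document.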
Seems to me like we already have the right tool for the job. What do others think?
@j-t-t @bpben
I think the document can serve as a temporary point of reference until we update our schemas :)
@shreyapandit to start migrating the Google Docs content into a markdown file in the repo.
Let me know if you need a hand.
First pass of markdown is here: https://github.com/Data4Democracy/crash-model/pull/227
Refining some of the sources for features since our model style changed recently.
@alicefeng could you add a short description explaining the scope of this task? I'm not sure I fully understand its application.
@shreyapandit is this something you have started on already? If so, feel free to add any relevant details, thanks.