ataki closed this issue 10 years ago
commit 3b964d0a182d5dafd5fbb457a579a18b9f020298 fixes this
This is bad... We'll need to look at the docs for each year and write mappings.
Yeah, this takes A WHILE. This discourages us from using super old data, which I don't mind. I think the farthest back we should go is 4 years, and we already have 60K+ rows from 2009 / 2010 alone, which should be good enough for this milestone.
tagging @scottcheng @petousis so you guys can see this.
Discovered this bug earlier tonight; kind of painstaking to have to fix it.
Basically, the 2009 and 2010 datasets aren't exactly the same; a few fields are missing from 2009 which are present in 2010, and this causes errors in field translations.
As an example: field index (857-859) holds Medicare revenue in 2010, but the same field sits at (856-858) in 2009.
This only happens starting at around index 800, so fields before that are still ok.
As a result, I need to correct `mapping.py` :frowning: However, this only applies to feature selection and should not block you guys; just wanted to make you aware in case you were starting to do FS. The fix should land by early tomorrow afternoon.
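For anyone curious what a per-year mapping could look like, here's a rough sketch. It is not the actual contents of `mapping.py`: the names `FIELD_RANGES`, `extract_field`, and `medicare_revenue` are made up for illustration, and I'm assuming the index ranges are 1-based and inclusive. Only the two Medicare revenue ranges come from the example above.

```python
# Hypothetical per-year field layout; NOT the real mapping.py.
# Each range is (start, end), assumed to be 1-based and inclusive.
FIELD_RANGES = {
    2009: {"medicare_revenue": (856, 858)},
    2010: {"medicare_revenue": (857, 859)},
}


def extract_field(record, year, field):
    """Return the slice of `record` that holds `field` for the given year."""
    try:
        start, end = FIELD_RANGES[year][field]
    except KeyError:
        # Fail loudly instead of silently reusing another year's layout.
        raise KeyError(f"no mapping for field {field!r} in year {year}") from None
    # Convert the 1-based inclusive range to a Python slice.
    return record[start - 1:end]
```

Keying the layout by year means a dataset year without a mapping raises an error rather than being read with the wrong offsets, which is the failure mode described above.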