Closed by amzoss 4 years ago
Should we use a quantifiable scale, like 1-3, with 1 being unprocessed raw data, 3 being processed and ready to use and publish, and 2 being something in the middle? Not sure what, though.
But different raw data will be clean to different degrees, so it might be worth being a bit more specific? For example, raw data from a survey where people hand-typed their company names could have very messy text data, but raw data from a sensor that always outputs exactly the same thing could be very clean.
It seems a bit too hard to create a coding scheme to capture this, and another free-text question will make the survey feel even longer. If messiness is part of the pedagogical value, that may come up in another question. Otherwise, this seems out of scope right now.