add something to metadata schema about messiness of data

visualizingthefuture / examples-repository

Repository for https://visualizingthefuture.github.io/examples-repository

add something to metadata schema about messiness of data #36

Closed amzoss closed 4 years ago

amzoss commented 4 years ago

Think about the "Goldilocks" data concept.
What are the levels of messiness?
What if some people want some messiness in the data?

elk2klein commented 4 years ago

Should we use a quantifiable scale, like 1-3, with 1 being unprocessed raw data, 3 being processed data that is ready to use and publish, and 2 being something in the middle? I'm not sure what the middle level would be, though.
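For illustration only, the 1-3 scale proposed here could be encoded as a constrained field in a metadata record. This is a minimal sketch, not part of the repository's actual schema: the names `ProcessingLevel`, `parse_processing_level`, and the `processing_level` key are all hypothetical, and the middle level is left as loosely defined as it is in the comment above.

```python
from enum import IntEnum


class ProcessingLevel(IntEnum):
    """Hypothetical 1-3 scale for how processed a dataset is."""
    RAW = 1            # unprocessed raw data
    IN_BETWEEN = 2     # "something in the middle"; not defined further in the thread
    READY_TO_USE = 3   # processed and ready to use and publish


def parse_processing_level(value):
    """Convert a submitted value (e.g. "2" from a form) into a ProcessingLevel.

    Raises ValueError if the value is not an integer from 1 to 3.
    """
    return ProcessingLevel(int(value))


# Example metadata record using the hypothetical field name.
record = {
    "title": "Example dataset",
    "processing_level": parse_processing_level("1"),
}
```

A single ordinal field like this is easy to fill in and filter on, but it only records how much processing was done, not how clean the data actually is.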

amzoss commented 4 years ago

But different raw data will be differently clean, so it might be worth being a bit more specific. For example, raw data from a survey where people hand-typed their company names could have very messy text data, while raw data from a sensor that always outputs exactly the same thing could be very clean.

amzoss commented 4 years ago

It seems a bit too hard to create a coding scheme to capture this, and another free-text question will make the survey feel even longer. If messiness is part of the pedagogical value, that may come up in another question. Otherwise, this seems out of scope right now.