jreades opened this issue 7 years ago
Should there be a global data directory (so datasets can be reused across atoms), or is data potentially replicated in each atom?
How do we ensure that someone isn't left hanging if they're offline? (We can embed simple data sets in the GitHub repo, and use the PySAL example data sets for more advanced tutorials.)
We need to remember to copy the data used into the final 'compiled' notebook directory, but then we also need to think about the size of the data sets.
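The copy-into-the-compiled-directory step could be scripted so it isn't forgotten, with a size check built in. A minimal sketch (the directory layout, function name, and size budget are all assumptions, not existing project conventions):

```python
# Hypothetical helper: copy an atom's data files into the compiled
# notebook directory, flagging anything over a size budget for review.
import shutil
from pathlib import Path

def copy_atom_data(src_dir, dest_dir, max_mb=50):
    """Copy every file under src_dir into dest_dir (preserving layout);
    return the names of files larger than max_mb megabytes."""
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    oversized = []
    for f in src.rglob("*"):
        if f.is_file():
            target = dest / f.relative_to(src)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target)
            if f.stat().st_size > max_mb * 1024 * 1024:
                oversized.append(f.name)
    return oversized
```

Anything returned in the oversized list could then be reviewed (or swapped for a smaller sample) before the compiled notebooks are committed.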
I was wondering whether the best approach is to use the data sets distributed with PySAL. Alternatively, the README for each atom could state: "Tutorials in this folder should use one of the following two data sets: A, B." We don't need to let everyone bring their own data to the party, as that isn't the purpose...
Enforce data set consistency within atoms -- so we should use one data set for all of the atoms in one group and then, say, a different data set for the ML atoms.
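If we go the "each atom's README names its allowed data sets" route, the rule could even be checked mechanically. A rough sketch of such a check (the allowed-list contents, file extensions, and function name are all hypothetical, not existing project tooling) that scans a notebook's JSON for data-file references:

```python
# Hypothetical consistency check: flag any data file referenced in a
# notebook that is not on the atom's allowed list.
import json
import re

# Assumed per-atom allow-lists; real names would come from each README.
ALLOWED = {"ml": {"dataset_a.csv", "dataset_b.csv"}}

def check_notebook(nb_json, atom):
    """Return data files referenced in the notebook JSON string
    that are not in the atom's allowed set."""
    nb = json.loads(nb_json)
    used = set()
    for cell in nb.get("cells", []):
        source = "".join(cell.get("source", []))
        # Look for common data-file extensions in cell source.
        used |= set(re.findall(r"[\w\-]+\.(?:csv|shp|geojson)", source))
    return sorted(used - ALLOWED[atom])
```

Run as part of the "compile" step, a non-empty return value would mean a tutorial has brought its own data to the party.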