visualizingthefuture / examples-repository

Repository for https://visualizingthefuture.github.io/examples-repository

create safe parsing of lists scripts within google sheets #87

Closed cassws closed 3 years ago

cassws commented 3 years ago

as per #63

amzoss commented 3 years ago

problem: if the Google Script autogenerates pids, it could accidentally change a pid that has already gone public, which would break a URL

potential solution: during manual curation, the curator can clean the data up a bit (standardize free-text responses, for example) and then manually assign a pid. The script can then treat the presence of a pid as the indicator that the record is ready to go, and only start cleaning records that have one.
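A minimal sketch of the "pid as readiness indicator" idea, written in Python only to illustrate the logic (the actual script in this thread is a Google Sheets script). The `pid` and `title` field names, the sample values, and the `records_ready_for_cleaning` helper are all hypothetical. The key point is that the code only reads the pid and never generates or changes one:

```python
def records_ready_for_cleaning(rows):
    """Return only the rows to which a curator has manually assigned a pid.

    Rows with a missing, empty, or whitespace-only pid are skipped.
    The pid itself is never created or modified here.
    """
    return [row for row in rows if (row.get("pid") or "").strip()]


# Hypothetical sample rows: only the first has a curator-assigned pid.
rows = [
    {"pid": "vtf-001", "title": "Example A"},
    {"pid": "", "title": "Example B"},
    {"pid": None, "title": "Example C"},
]

ready = records_ready_for_cleaning(rows)
```

Because the pid is assigned by hand and never rewritten, a pid that is already public stays stable no matter how often the cleaning step is rerun.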

question: should manual curation happen in a separate tab (or even a separate sheet) from the one populated by the Google Form? if so, should there be one combined cleaned tab, or a separate cleaned tab for each collection? we could end up with three curated tabs (datavis, datasets, other) plus three more tabs that have been cleaned by the script and have safe lists. I would really rather not have the script overwriting anything, and ideally the curator would not overwrite the original form submission either.

amzoss commented 3 years ago

Current solution: one manual cleaning step to add a pid to approved examples, standardize free-text responses, and manually add pipes to free-text lists. The Google Sheet script then automatically cleans the lists that come from multiple-choice submission questions and splits the submissions into a separate tab for each collection. Each tab can then be exported as a .csv and saved into the repo's data-tools directory, and the python script can be run on the .csvs to generate json.
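The last step of that workflow (exported .csv to json, with pipe-delimited lists split into arrays) could look roughly like the sketch below. This is not the repository's actual script: the column names (`pid`, `title`, `tags`), the sample data, and the helper names are assumptions made for illustration.

```python
import csv
import io
import json

# Hypothetical: columns whose cells hold pipe-delimited lists.
LIST_COLUMNS = ("tags",)


def split_pipes(value):
    """Split a pipe-delimited cell into a list of trimmed, non-empty items."""
    return [item.strip() for item in value.split("|") if item.strip()]


def csv_to_records(csv_text, list_columns=LIST_COLUMNS):
    """Parse exported CSV text into a list of dicts ready for JSON output.

    Rows without a pid are skipped, since the pid marks an approved record.
    """
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not (row.get("pid") or "").strip():
            continue
        for column in list_columns:
            row[column] = split_pipes(row.get(column) or "")
        records.append(row)
    return records


# Hypothetical export: the second row has no pid, so it is left out.
sample = "pid,title,tags\nvtf-001,Example A,maps | networks\n,Unapproved,charts\n"
records = csv_to_records(sample)
print(json.dumps(records, indent=2))
```

For a real export, the text would be read from the .csv file in the data-tools directory rather than from an inline string.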