Closed DanielleQuinn closed 3 years ago
DQ, your idea sounds like a good one, and I've told my grad student to do much the same (so she brings me the good stuff!). My understanding is that we're focusing on keeping tidy data in spreadsheets, not focusing so much on tidy folder organization. I'd be happy for others to chime in.
In the "Formatting Data Tables in Spreadsheets" segment of the lesson, it suggests using multiple tabs in a spreadsheet file when you are cleaning up or modifying your data. I worry that keeping all of your original and modified data, along with your notes about how / why the data have been modified, in a single file is risky and perhaps a little clunky. If I may, I'd like to put forward an alternate method, just as a potential discussion point.
What I teach students (and practice myself) is to create a folder called "Data" in the appropriate location (e.g. Thesis > Data). Within that folder, I create three subfolders:
I have a text file in the Data folder that documents what changes occurred at each date / iteration.
When using R, for example, the data are always imported from the Working Data folder, so the analysis always uses the most up-to-date version. If the Working Data folder is shared or used by multiple people, this also means there is little chance of accidentally changing or analyzing a previous version of the data, because that version is now stored separately in the Archived Data folder and the date in its file name identifies it as no longer in use.
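To make the workflow concrete, here is a minimal sketch of the archive-and-replace step. It is in Python purely for illustration, and it is only one way this could be scripted, not a description of how I actually do it. The folder names "Working Data" and "Archived Data" come from the description above. The function name `archive_and_replace`, the log file name `changes.txt`, and the `name_YYYY-MM-DD` archive naming are assumptions made for the example.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_and_replace(data_dir, new_file, note, today=None):
    """Retire the current working file to the archive with a date stamp,
    install new_file as the working copy, and log the change.

    Assumes new_file lives outside the Working Data folder.
    """
    data_dir = Path(data_dir)
    new_file = Path(new_file)
    working = data_dir / "Working Data"    # folder name from the post
    archived = data_dir / "Archived Data"  # folder name from the post
    log = data_dir / "changes.txt"         # hypothetical log file name
    working.mkdir(parents=True, exist_ok=True)
    archived.mkdir(parents=True, exist_ok=True)

    stamp = (today or date.today()).isoformat()
    current = working / new_file.name
    if current.exists():
        # The date in the archived name marks this version as no longer in use.
        retired = archived / f"{current.stem}_{stamp}{current.suffix}"
        if retired.exists():
            raise FileExistsError(f"already archived today: {retired}")
        shutil.move(str(current), str(retired))

    # Install the new version under the same name the analysis script reads.
    shutil.copy2(new_file, current)

    # Record what changed and when in the plain-text log.
    with log.open("a", encoding="utf-8") as fh:
        fh.write(f"{stamp}\t{new_file.name}\t{note}\n")
    return current
```

An analysis script would then always read from the same path inside Working Data (for instance `Data/Working Data/survey.csv`, a made-up file name), so it never needs to know which dated versions exist in the archive.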
Thoughts or suggestions are more than welcome!
-DQ