Reproducible-Science-Curriculum / organization-RR-Jupyter

Data and project organization for Reproducible Research. Link to old, longer version:
https://reproducible-science-curriculum.github.io/organization-RR-Jupyter/
Creative Commons Zero v1.0 Universal

Section 04 - Modifying #8

Closed: choldgraf closed this issue 6 years ago

choldgraf commented 7 years ago

https://github.com/Reproducible-Science-Curriculum/organization-RR-Jupyter/blob/gh-pages/_episodes/04-modifying.md

choldgraf commented 7 years ago

I just pushed a draft of these materials. Feel free to take a look.

dleehr commented 7 years ago

"Open the data file in 01_cleaning using Excel (or your GUI text editor of choice)"

Should we be working with a CSV file at this point? Section 02 has users copying the original asdf.xlsx file around without converting it to CSV (though the Google Doc discusses doing so). I assume CSV is preferred for the next lesson, so perhaps we should convert to CSV either here or in Section 02.

dleehr commented 7 years ago

I think I'll plan to distribute the xlsx file and have the students export it to CSV from 00_raw to 00_cleaning, rather than copying the data in Section 02.

naupaka commented 7 years ago

I think the idea is that if you are going to use Excel to clean up your data, it makes sense to export to CSV only once the data has been cleaned, not at an intermediate step. Some of what you are working with in that file may depend on formatting that is lost on export to CSV (formatted text, cell coloring, formulas, etc.). So you do the conversion only once that extra information and metadata is encoded in the file in a way that survives export to plain text. That was our idea, anyway.
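
To illustrate the point about lost formatting, here is a minimal sketch using only Python's standard library. The `cells` structure and the `export_to_csv` helper are hypothetical stand-ins for a workbook and its export step, not a real Excel API; they only show that a CSV export keeps the cell values and nothing else.

```python
import csv
import io

# Hypothetical spreadsheet cells: each carries a value plus formatting metadata,
# standing in for what an Excel workbook stores (this is not a real Excel API).
cells = [
    [{"value": "site", "bold": True}, {"value": "count", "bold": True}],
    [{"value": "A", "fill": "yellow"}, {"value": 12, "formula": "=SUM(B3:B5)"}],
]


def export_to_csv(rows):
    """Write only the cell values to CSV text; formatting and formulas are dropped."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow([cell["value"] for cell in row])
    return buffer.getvalue()


csv_text = export_to_csv(cells)
print(csv_text)
```

The cell coloring, bold text, and formula are absent from the resulting text, so any information they carried has to be moved into ordinary columns before exporting.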

iamciera commented 6 years ago

This lesson was dramatically shortened, so this section was dropped.

hlapp commented 6 years ago

I did bring it back, with instructor guide notes to skip it if the lesson is taught as part of the full curriculum, since in that case it is largely redundant with the Data Exploration lesson.