datacarpentry / spreadsheet-ecology-lesson

Data Organization in Spreadsheets for Ecologists
https://datacarpentry.org/spreadsheet-ecology-lesson

Several Issues with 04-Quality-Control #247

Closed Darren-Intersect closed 4 years ago

Darren-Intersect commented 5 years ago
  1. The data validation section does not mention that the rules only apply to new data, not existing data.
  2. There is a section on the README file that is not included in the learning objectives.
  3. Exercises for the data validation: there are currently none. It would be good to have one that highlights some of the available options: setting the min/max date range for the sample and testing it, creating a list for M/F and testing it, and setting input and error messages.
hoytpr commented 5 years ago

Hi @Darren-Intersect and thanks for working on this lesson. Your comments are very much appreciated. Let's see if we can address these separately:

No mention with data validation that the rules only apply to new - not existing data

I see your point, although the next section/exercise on cleaning data does deal with cleanup of existing data, because Excel doesn't have a "check my data" button. That's why the lesson refers to other lessons when it says: "It is nice to be able to do these scans in spreadsheets, but we also can do these checks in a programming language like R, or in OpenRefine or SQL."
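As a rough illustration of what such a check on existing data could look like outside the spreadsheet, here is a minimal Python sketch. The column names (`record_id`, `date`, `sex`), the date bounds, and the sample rows are all assumptions made up for this example, not taken from the lesson data.

```python
import csv
import io
from datetime import date

# Hypothetical sample data; column names and values are illustrative only.
SAMPLE = """record_id,date,sex
1,2013-07-16,M
2,2013-07-17,F
3,2031-01-01,M
4,2013-07-18,male
"""

ALLOWED_SEX = {"M", "F"}       # assumed list of valid entries
MIN_DATE = date(2013, 1, 1)    # assumed lower bound for sampling dates
MAX_DATE = date(2013, 12, 31)  # assumed upper bound for sampling dates


def find_invalid_rows(text):
    """Return (record_id, reason) pairs for rows that break a rule."""
    problems = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            sampled = date.fromisoformat(row["date"])
        except ValueError:
            problems.append((row["record_id"], "unparseable date"))
        else:
            if not MIN_DATE <= sampled <= MAX_DATE:
                problems.append((row["record_id"], "date out of range"))
        if row["sex"] not in ALLOWED_SEX:
            problems.append((row["record_id"], "sex not in list"))
    return problems


invalid_rows = find_invalid_rows(SAMPLE)
for record_id, reason in invalid_rows:
    print(record_id, reason)
```

Unlike a validation rule in the spreadsheet, a scan like this runs over rows that are already present, which is the gap raised in the first point.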

Maybe you could create links to these lessons for checking your data, and generate a pull request that would make these links "active"?

There is a section on the README file that is not included in the learning objectives. Exercises for the data validation: there are currently none.

Excellent point! An example README file could be included that relates directly to the data changes made in the lesson. Thanks for the great feedback on this. As maintainers we always encourage the community to contribute these changes with a pull request. Is this something you'd be interested in doing?

It would be good to have one that highlights some of the available options: setting the min/max date range for the sample and testing it, creating a list for M/F and testing it, and setting input and error messages.

Hmmmm. This seems to be covered in the Data validation section. Could you be more specific about where this would need to be present in this lesson?

hoytpr commented 5 years ago

After my last answer I did notice that Excel does have a data check, where it will circle invalid data after you've made changes to the data types and limits: check my data in excel

It might be a good idea to make a point of this in the exercise.

hoytpr commented 5 years ago

It's been a while since this issue was filed. It looks like the first two parts of the issue are fixed, but the third part: "No Exercise for data validation" is still relevant. @Darren-Intersect I wonder if you would be interested in submitting a pull request on 04-quality-control.md where you include a simple data validation exercise as you described? Or with another type of validation e.g. "number of characters less than or equal to".
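For reference, the "number of characters less than or equal to" rule is simple to state in code as well. This is only a sketch: the two-character limit and the example codes are hypothetical and not taken from the lesson.

```python
MAX_LENGTH = 2  # assumed limit on the number of characters per entry


def too_long(values, max_length=MAX_LENGTH):
    """Return the entries whose text length exceeds max_length."""
    return [value for value in values if len(value) > max_length]


# Hypothetical entries; the third one breaks the length rule.
species_codes = ["DM", "PF", "DMX", "NL"]
flagged = too_long(species_codes)
print(flagged)
```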

hoytpr commented 4 years ago

This is very similar to issue #283, so a few more examples of data validation seem to be in order. Comments, @cbahlai or @ErinBecker? Should one of us put in a PR?

hoytpr commented 4 years ago

Closing after recent merges.