Closed ErinBecker closed 7 years ago
We have used it in both workshops that I have been involved in, what worked will in the first workshop was a "clean" version of the species data that was supplied to the participants. This saved time and gave the same data to work with to participants, I've got a clean version of 2013 Field data that could possibly be added as a separate download to base the question on.
Thanks @zjsteyn, could you share this data?
Please find attached,
The data is tabular but still have errors that will work will in a sorting or a data validation exercise. The weight_grams column still has cells with no data and the letter "g" in some of the cells.
A sorting exercise will show how outliers short to the top or bottom, it can also work will with a data validation exercise where data validation can circle values that do not conform. I will draft two scenarios for consideration.
The xlsx file. survey_data_messy_quality_control.xlsx
Possible Validation Exercise using Sort
Exercise 1
Challenge: Sort the weight_grams column in your spreadsheet program from Smallets to Largest. What do you notice?
(Remember to expand your sort in order to prevent data corruption. Expanding your sort ensures that the the all the data in one row move together instead of only sorting a single column in isolation.)
Solution
Note how the outliers in the case of the 2013 data sort to the bottom of the tabular data. The cells containing no data values as well as the cells where the letter "g" was included can be found towards the bottom of the tabular data.
{: .solution}
{: .challenge}
Possible Validation Exercise using Data Validation
Exercise 2
Challenge: Utilising Data Validation (Excel) or Validity (Libre Office Calc) to validate whether the values entered in the Weigh_grams colum are valid integers (whole numbers) between 1 and 150.
Solution
In order to identify all the values that are not whole numbers, you have to select all the cells in the Weight_grams colum and click on Data Validation. You then need to alter the Data Validation input screen to resemble the following screenshot.
After clicking on OK you need to select Circle Invalid Data from the Data Validation drop down.
Note how all the cells where values are not whole numbers are indicated once you run the validation function on them.
{: .solution}
{: .challenge}
Will be refining during 7th Bug BBQ
@zjsteyn I figured out how to update the data link and merged this. Thanks for your contribution, it's great!
In 04-quality-control.md under "Sorting" there is an exercise that isn't marked as one. The lines starting with "When you do this sort . . ." and "Try sorting" could be called out as an exercise.