Open bencomp opened 2 years ago
I do realise now that this has been mentioned in part in #79 and also relates to #56 and #38.
I think we should look at the Library Carpentry lesson on OpenRefine for clearer use cases in the introduction episode: splitting data elements into different columns, normalising date formats and maybe matching/enhancing. These use cases would replace the Motivations section, which I feel is currently written for potential instructors.
Let's replace the Features and Getting help sections with "How is OR different from spreadsheet applications?" and "When would you write a script instead of using OR?".
From #37:
Perhaps it's also useful to distinguish OR from using SQL with a relational database. SQL also allows selecting rows and creating derived columns. The cross function allows you to join data from different projects, like JOIN in SQL. (cross is not currently part of the lesson, but I have used it myself.)
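To make the SQL side of that comparison concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and values are made up for illustration and are not taken from the lesson data. It shows a LEFT JOIN that looks up a value from a second table by a shared key, which is roughly the kind of lookup cross does between two OpenRefine projects (an analogy, not a description of how cross is implemented).

```python
import sqlite3

# Hypothetical data: two tables standing in for two OpenRefine projects.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE surveys (record_id INTEGER, species_id TEXT);
    CREATE TABLE species (species_id TEXT, genus TEXT);
    INSERT INTO surveys VALUES (1, 'NL'), (2, 'DM'), (3, 'XX');
    INSERT INTO species VALUES ('NL', 'Neotoma'), ('DM', 'Dipodomys');
""")

# LEFT JOIN keeps every survey row and fills in the genus where the key
# matches; rows without a match get NULL (None in Python).
rows = conn.execute("""
    SELECT s.record_id, s.species_id, sp.genus
    FROM surveys AS s
    LEFT JOIN species AS sp ON sp.species_id = s.species_id
    ORDER BY s.record_id
""").fetchall()

for row in rows:
    print(row)
```

In SQL the join is part of a query that returns a new result set; in OR the lookup would typically be used to add a new column to the current project, which may be a useful difference to point out to learners.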
Remember to remove the mention of this issue from the Instructor note in the Introduction section when this issue is resolved. See #183.
I have taught the OpenRefine lesson a few times; most recently today. Even though I always try to explain when you could choose OpenRefine for a problem, and how to compare OpenRefine to spreadsheets and writing a script, students keep asking for more explanation and comparisons. In our workshop the OpenRefine lesson is between Data organisation in spreadsheets and Introduction to R and that is also how I tried to frame OpenRefine: it shows your data like a spreadsheet application, but it has powers like a programming environment.
Seeing how I keep struggling to explain it well, even with years of experience with OR, we should probably improve the lesson materials.
Helpers suggested that it might help to refer back, later in the lesson, to how I situated OR between spreadsheets and programming in the introduction. For that to work, though, the introduction episode should provide more context first.