datacarpentry / R-ecology-lesson

Data Analysis and Visualization in R for Ecologists
https://datacarpentry.org/R-ecology-lesson/

Link to Tidy Data Tutor to help visualize data transformations step-by-step when using pipes #783

Open samanthacsik opened 2 years ago

samanthacsik commented 2 years ago

There's a new browser-based tool called Tidy Data Tutor (and a pandas version, Pandas Tutor) that lets you visualize how a data frame changes at each step of a data analysis/transformation pipeline. I've found it to be a really helpful teaching tool in workshops where I introduce the pipe operator, %>%, for the first time. Tidy Data Tutor breaks a pipeline down step by step (i.e. at each %>%) and shows exactly how the data frame is altered at that step. I've created a simple example demonstrating the dplyr functions filter(), select() and arrange(), or see the screenshots below:

Step 1: Create some data in the browser-based editor (here I created a mini version of the portal_data_joined.csv used in this workshop).

[Screenshot: Screen Shot 2021-12-21 at 4 12 05 PM]

Step 2: Visualize the data analysis pipeline.

[Screenshot: Screen Shot 2021-12-21 at 4 11 20 PM]
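For anyone reading without the screenshots, a pipeline of this kind can be sketched in R roughly as follows. This is only an illustration: the values in `surveys_mini` are invented, the column names are assumed to match a few of those in portal_data_joined.csv, and `step1` to `step3` are hypothetical names used to expose the intermediate data frames that Tidy Data Tutor draws at each %>%.

```r
library(dplyr)

# Hypothetical mini data frame; the values are made up for illustration.
surveys_mini <- data.frame(
  record_id  = c(1, 2, 3, 4, 5),
  species_id = c("NL", "DM", "PF", "DM", "PF"),
  sex        = c("M", "F", "F", "M", "M"),
  weight     = c(218, 40, 7, 44, 8),
  stringsAsFactors = FALSE
)

# The whole pipeline written with pipes.
result <- surveys_mini %>%
  filter(weight < 50) %>%            # keep rows with weight below 50
  select(species_id, sex, weight) %>% # keep three columns
  arrange(weight)                     # sort rows by ascending weight

# The same pipeline, saving the data frame after each step.
step1 <- filter(surveys_mini, weight < 50)
step2 <- select(step1, species_id, sex, weight)
step3 <- arrange(step2, weight)

print(step1)
print(step2)
print(step3)
```

Printing `step1`, `step2` and `step3` in turn gives a text-only version of what the tool shows visually: rows dropped, then columns dropped, then rows reordered.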

It's easy to embed the reproducible pipeline visualization into lesson materials using the "Sharable URL" at the bottom of the page. Note that because this is a very new tool (I think it was released only about two weeks ago), it may still be a little buggy.

Still, I think it could add value to the Data Manipulation using dplyr and tidyr episode in the Data Analysis and Visualization in R for Ecologists Data Carpentry workshop (or any workshop where pipes are taught), particularly after the first example of a pipe:

[Screenshot: Screen Shot 2021-12-21 at 4 23 18 PM]

One suggestion may be to include language immediately following the text in the screenshot above that states something like:

...Since %>% takes the object on its left and passes it as the first argument to the function on its right, we don’t need to explicitly include the data frame as an argument to the filter() and select() functions any more. To understand the step-wise transformations taking place each time a pipe is used to string together tidyverse functions, you can explore the output of this new online tool, Tidy Data Tutor.
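To make the "first argument" point concrete, here is a minimal sketch comparing nested calls with the piped form. The data frame `surveys_mini` and its values are hypothetical stand-ins rather than the lesson's actual data, and the `weight < 5` filter is assumed to resemble the lesson's first pipe example.

```r
library(dplyr)

# Hypothetical stand-in for the lesson's surveys data; values are invented.
surveys_mini <- data.frame(
  species_id = c("NL", "PF", "PF"),
  sex        = c("M", "F", "M"),
  weight     = c(218, 7, 4),
  stringsAsFactors = FALSE
)

# Nested calls: the data frame is passed explicitly as the first argument.
nested <- select(filter(surveys_mini, weight < 5), species_id, sex, weight)

# Piped: %>% supplies the left-hand side as the first argument,
# so the data frame is not repeated inside filter() or select().
piped <- surveys_mini %>%
  filter(weight < 5) %>%
  select(species_id, sex, weight)

# Both forms should give the same data frame.
print(identical(nested, piped))
```

The piped form reads top to bottom in the order the operations happen, which is the order a step-by-step visualization follows.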

tobyhodges commented 4 months ago

Thanks @samanthacsik for opening this issue. The lesson underwent a major update and reorganisation when https://github.com/datacarpentry/R-ecology-lesson/pull/887 was merged. Although this issue refers to content in a version of the lesson before that update took place, the overall suggestion may still be relevant.