datacarpentry / R-ecology-lesson

Data Analysis and Visualization in R for Ecologists
https://datacarpentry.org/R-ecology-lesson/

Reproducibility issue with the 'manipulating data' section output #682

Closed yyachung closed 3 years ago

yyachung commented 3 years ago

First of all, thanks for making your tutorials available! I used the 'manipulating data' section of the tutorial with my students, and some students were able to reproduce the final number of rows listed in the tutorial (i.e. dim(surveys_complete)), whereas the majority (including myself) could not. Most of us ended up with 30521 rows instead. I've tried to diagnose the issue myself with no success. Has this happened to anyone else?

fmichonneau commented 3 years ago

Hi @yyachung! Thanks for bringing this up. Did you read the data set using read_csv() or read.csv()? If you could post the commands you used to obtain 30521 rows, that would help us understand where the mismatch might be coming from. Thanks!

yyachung commented 3 years ago

I used read.csv(), and I just re-ran with read_csv() and got the same results as listed in the tutorial (30463 obs)! I would never have thought to check that. Interesting!

Teebusch commented 3 years ago

This is a very easy mistake to make, especially for beginners. Even when copying from the instructor's screen, the difference between read.csv() and read_csv() is easy to miss, and either will work at first. Perhaps we should add a warning note or 'gotcha' to the lesson or the instructor notes?

fmichonneau commented 3 years ago

> Perhaps we should add a warning note or 'gotcha' to the lesson or the instructor notes?

That sounds like a great idea!
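
The thread does not spell out why the two readers give different row counts. A likely explanation, offered here as an assumption rather than something confirmed above, is that read.csv() keeps blank fields in character columns (such as sex) as empty strings, while read_csv() treats them as NA by default, so a later filter on !is.na(sex) drops more rows. The sketch below uses a made-up four-row CSV and base R only; csv_text and keep_complete() are hypothetical names, and na.strings = c("", "NA") stands in for read_csv()'s default handling of blanks.

```r
# Hypothetical four-row extract; column names mirror the lesson's surveys data,
# but the values are invented for illustration.
csv_text <- "record_id,sex,weight
1,M,40
2,,48
3,F,NA
4,F,37"

# Base R default (na.strings = "NA"): the blank sex field in a character
# column is kept as the empty string "", not NA.
base_default <- read.csv(text = csv_text)

# Treat blanks as missing too, which is assumed here to match
# readr::read_csv()'s default of na = c("", "NA").
blank_as_na <- read.csv(text = csv_text, na.strings = c("", "NA"))

# Hypothetical helper: keep only rows where sex and weight are both non-missing.
keep_complete <- function(df) df[!is.na(df$sex) & !is.na(df$weight), ]

nrow(keep_complete(base_default))  # 3: the row with the blank sex survives
nrow(keep_complete(blank_as_na))   # 2: the row with the blank sex is dropped
```

If the lesson has to be followed with read.csv(), passing na.strings = c("", "NA") is one option that might bring the row count in line with the tutorial; this has not been checked against the full data set here.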