swcarpentry / r-novice-gapminder

R for Reproducible Scientific Analysis
http://swcarpentry.github.io/r-novice-gapminder/

Missing instructions for where the cats data frame comes from #307

Closed sandrabrosda closed 6 years ago

sandrabrosda commented 7 years ago

I was helping with this R session and there was confusion about where to get the cats data frame from or how to create it. Maybe the lesson should first explain how to create a data frame from scratch and then go into how to manipulate it. The missing command for the cats data frame could be cats <- data.frame(coat = c("calico", "black", "tabby"), weight = c(2.1, 5.0, 3.2), likes_string = c(1, 0, 1))

Sandra
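The suggestion above could be sketched roughly as follows. The data frame call is the one proposed in the comment; the round trip through a CSV file in a temporary directory (and the names `csv_path` and `cats_from_csv`) is an illustrative assumption, added only to show that `read.csv()` could still be taught after building the data in R.

```r
# Build the cats data frame directly in R (code from the comment above).
cats <- data.frame(
  coat = c("calico", "black", "tabby"),
  weight = c(2.1, 5.0, 3.2),
  likes_string = c(1, 0, 1)
)

# Inspect the structure: three rows, three columns.
str(cats)

# Assumption for illustration: write the data frame to a CSV file in a
# temporary directory and read it back, so no hand-typed file is needed.
csv_path <- file.path(tempdir(), "feline-data.csv")
write.csv(cats, csv_path, row.names = FALSE)
cats_from_csv <- read.csv(csv_path)
```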

naupaka commented 7 years ago

Hi @sandrabrosda thanks for the note. The beginning of episode 4 has instructions for making this file in a text editor. Presumably this would go along with some discussion of the csv format. Creating the data frame directly in R would then obviate the need for the bit on read.csv().

However, I agree it is a bit odd to ask folks to create a csv by hand, since this is something they will almost never do in normal use. I think it might be reasonable to change this lesson to have the data frame created within R instead, with the code you included. Would you be willing to submit a PR for that change that I could review?

sandrabrosda commented 7 years ago

Hi Naupaka,

I’m happy to submit a PR for that change. However, I have to figure out how to do it first and I’m away for almost the rest of the month. But if you’re happy for me to do it next month I’ll do it.

Regards, Sandra

naupaka commented 7 years ago

Sure, no rush. There are some instructions for contributing, here. I'm also happy to help once you get to that point.

missaugustina commented 7 years ago

I used this lesson as my example lesson for the instructor training course. I used "download.file" in an R notebook then read.csv.

download.file(
  "https://raw.githubusercontent.com/swcarpentry/r-novice-gapminder/gh-pages/_episodes_rmd/data/feline-data.csv",
  "cats_swcarp.csv"
)

cats <- read.csv("cats_swcarp.csv")

naupaka commented 7 years ago

@missaugustina you can also read a csv directly from a URL if you like.

cats <- read.csv("https://raw.githubusercontent.com/swcarpentry/r-novice-gapminder/gh-pages/_episodes_rmd/data/feline-data.csv")

missaugustina commented 7 years ago

I don't recommend reading directly from the URL, for two reasons: 1) possible Wi-Fi issues, and 2) a local copy ensures you know which version of the file you are working with.
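One way to combine both points is to download the file only when no local copy exists and then always read from disk. This is a minimal sketch, not something from the lesson: the helper name `read_local_copy` is hypothetical, and it assumes the URL and file name used earlier in this thread.

```r
# Hypothetical helper: download `url` to `destfile` only if the file is not
# already on disk, then read the local copy. Later calls never touch the
# network, so the data stay at the version that was first downloaded.
read_local_copy <- function(url, destfile) {
  if (!file.exists(destfile)) {
    download.file(url, destfile)
  }
  read.csv(destfile)
}

# Example call (left commented out because it needs network access):
# cats <- read_local_copy(
#   "https://raw.githubusercontent.com/swcarpentry/r-novice-gapminder/gh-pages/_episodes_rmd/data/feline-data.csv",
#   "cats_swcarp.csv"
# )
```

Deleting the local file forces a fresh download on the next call.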

mfoos commented 7 years ago

We just created this data.frame by hand in a workshop and it generated some frustration, because apparently if you create a csv that way but don't follow the last row with a blank line, read.csv won't read it.
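If the csv step is kept, one way to sidestep the missing final newline is to write the file from R instead of a text editor. The sketch below is an assumption-laden illustration: the file contents are inferred from the data frame proposed at the top of this thread, and the path in a temporary directory is made up for the example.

```r
# Lines of the csv file, inferred from the cats data frame proposed above.
csv_lines <- c(
  "coat,weight,likes_string",
  "calico,2.1,1",
  "black,5.0,0",
  "tabby,3.2,1"
)

# writeLines() ends every line, including the last one, with a newline,
# so the file does not depend on how an editor treats the final line.
csv_path <- file.path(tempdir(), "feline-data-by-hand.csv")
writeLines(csv_lines, csv_path)

cats <- read.csv(csv_path)
str(cats)
```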

naupaka commented 7 years ago

@mfoos do you think it would be better to just create the data frame directly and bypass the csv step?

mfoos commented 7 years ago

Yes - obviously reading csvs in is the most realistic, but for little examples like this, I hate to tempt the whitespace gods.

cmaimone commented 6 years ago

A related, but I think more serious, issue is that the Data Types section of Part 4 refers to a file, data/feline-data_v2.csv, that isn't created (or downloaded) before it is referenced. However the initial creation of the cats data.frame is resolved, the same fix should probably be applied to this part as well.

naupaka commented 6 years ago

Closed by #338