StirlingCodingClub / studyGroup

Gather together a group to skill-share, co-work, and create community
http://StirlingCodingClub.github.io/studyGroup/

Question: using gather() in dplyr #17

Open mattnuttall00 opened 5 years ago

mattnuttall00 commented 5 years ago

Hi all,

My office pal Paulo is trying to manipulate his data into a "tidy" format, but is struggling with the gather() function. I've not used that function much, so I'm not much help, and we were hoping some of the tidyverse whizzes could lend a quick hand.

His data is a temperature time series, and at the moment it looks like this:

Years  Jan_min  Jan_max  Feb_min  Feb_max
1960   xx       xx       xx       xx
1961   xx       xx       xx       xx

with a structure of 58 obs. of 28 variables

He would like it to look like:

year  month  max/min  temp
1960  Jan    max      xx
1960  Jan    min      xx

etc.

I thought it was something like

data %<% gather(key=month, value=max/min, 2:28)

But that ain't working. I have attached the data if that helps: paulo_data.xlsx

Thanks!

anna-deasey commented 5 years ago

library(tidyverse) # loads readr, tidyr, dplyr and stringr, all used below

paulo <- read_csv("paulo_data.csv") # reads in data

paulo <- paulo %>%
  gather(key = month_min_max, # name of new key column
         value = value,       # name of new value column
         # list all of the columns to be used to make the new key and value pair columns
         Jan_min, Jan_max, Feb_min, Feb_max, Mar_min, Mar_max,
         Apr_min, Apr_max, May_min, May_max, Jun_min, Jun_max,
         Jul_min, Jul_max, Aug_min, Aug_max, Sep_min, Sep_max,
         Oct_min, Oct_max, Nov_min, Nov_max, Dec_min, Dec_max)

paulo <- paulo %>% separate('month_min_max', c('month', 'min_max'), sep = "_") # separates out month and min_max into different columns

names(paulo) <- names(paulo) %>% stringr::str_replace_all("\\s", "_") # this replaces all whitespace in the column names with _'s. whitespace is bad

paulo <- paulo %>% select(Years, month, min_max, value, Annual_Minimum, Annual_Maximum, Mean_of_the_Year) # this reorders the columns into something sensible
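
For anyone who wants to try the reshaping steps above without the attached spreadsheet, here is a minimal sketch on made-up data. The `toy` data frame and its values are invented for illustration, and the `temp` column name is an assumption; only the `Years`, `Jan_min`, `Jan_max`, ... naming pattern comes from the thread. It also shows `pivot_longer()`, which tidyr 1.0.0 and later provide as the successor to `gather()` and which can split the column names in the same call.

```r
library(tidyr)

# Made-up stand-in for Paulo's data (values are invented)
toy <- data.frame(
  Years   = c(1960, 1961),
  Jan_min = c(-1.2, 0.3),
  Jan_max = c(6.5, 7.1),
  Feb_min = c(-0.8, 1.0),
  Feb_max = c(7.4, 8.2)
)

# gather() every column except Years, then split "Jan_min" into "Jan" and "min"
long_gather <- toy %>%
  gather(key = "month_min_max", value = "temp", -Years) %>%
  separate(month_min_max, into = c("month", "min_max"), sep = "_")

# pivot_longer() does both steps in one call (needs tidyr >= 1.0.0)
long_pivot <- toy %>%
  pivot_longer(
    cols      = -Years,
    names_to  = c("month", "min_max"),
    names_sep = "_",
    values_to = "temp"
  )

print(long_gather)
print(long_pivot)
```

Both results hold the same values in the columns `Years`, `month`, `min_max` and `temp`; only the row order differs. Using `-Years` instead of typing out all 24 column names only works when every other column should be gathered, which is not the case for the full data set with its annual summary columns.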

mattnuttall00 commented 5 years ago

Awesome, many thanks Anna!

If anyone else reads this in the future: for some reason the line removing whitespace from the column names didn't work on my computer (but did on Anna's), whereas the code below, which I found, did work on my computer but not on Anna's. Go figure.


names(paulo) <- make.names(names(paulo), unique = TRUE)
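
One thing worth knowing when choosing between the two renaming approaches: they do not produce the same names. The sketch below uses base R only; the `old_names` vector is a hypothetical example loosely based on the column names mentioned in this thread, not the real file. Note also that in an R string the regex `\s` has to be written with a doubled backslash, `"\\s"`. Whether either point explains the difference between the two machines is not established here.

```r
# Hypothetical column names containing spaces (not read from the real file)
old_names <- c("Years", "Annual Minimum", "Mean of the Year")

# make.names() turns each space into a dot
dotted <- make.names(old_names, unique = TRUE)

# gsub() with an escaped "\\s" turns each whitespace character into an underscore
underscored <- gsub("\\s", "_", old_names)

print(dotted)
print(underscored)
```

A later `select(Annual_Minimum, ...)` would only find the underscore versions, so the dotted names would need `select(Annual.Minimum, ...)` instead.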

adamaki commented 5 years ago

Hi Matt,

I often find that a function works fine on one machine but not on another. It's usually because two packages export a function with the same name but different arguments, and the one you want has been masked. Detaching the conflicting package usually works, as does specifying the package you want in the code, just as Anna did with stringr::str_replace_all, which is why it's a bit odd that this didn't work on your machine! Anyway, it might be worth investigating further if you want to find the cause of the problem...
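
If it helps with the investigating, here is a rough sketch of how masking can be checked from the console. It uses `filter` as the example because dplyr and stats both define it; that choice is just for illustration and is not the function that caused trouble in this thread.

```r
library(dplyr)

# Every attached package that defines `filter`; the first one listed is the one
# a bare call to filter() will use
where_found <- utils::find("filter")
print(where_found)

# conflicts() lists all names defined in more than one attached place
masked <- conflicts()
print("filter" %in% masked)

# Namespace-qualified names bypass the search path, so both stay reachable
same_function <- identical(stats::filter, dplyr::filter)
print(same_function)

# Detaching is the other option mentioned above (not run here):
# detach("package:dplyr", unload = TRUE)
```

Running the same checks on both machines and comparing the output of `find()` would show whether the two sessions had different packages attached.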