hannesdatta / course-dprep

This repository hosts the course website of Tilburg University's open education class on "Data Preparation and Workflow Management" (dPrep) - start managing your empirical research projects efficiently!
https://dprep.hannesdatta.com

add solutions to slide deck #137

Closed hannesdatta closed 3 months ago

hannesdatta commented 2 years ago

Here's the source code of today's tutorial. It would be fantastic if you could take the .rpres of today's tutorial and add the solutions to it.

Check the .rpres for how I did it. It's also possible to add NEW SLIDES with the source code. Just try out what works.

Contribute via a PR - good practice.

Thanks!

library(tidyverse)
download.file('https://github.com/hannesdatta/course-dprep/raw/master/content/docs/tutorials/data-preparation/data.zip', 'data.zip')
unzip('data.zip')

streams <- read_csv('streams.csv')

songs <- read_csv('songs.csv')

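# country_codes.csv is semicolon-separated; either call below reads it
# (read_csv2 additionally treats "," as the decimal mark)
country_codes <- read_csv2('country_codes.csv')
country_codes <- read_delim('country_codes.csv', delim = ';')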

# count rows per song_id-date-country combination; any n > 1 signals duplicates
tmp <- streams %>% count(song_id, date, country)
table(tmp$n)

# delete duplicates: keep the first row per song_id-date-country combination
streams_without_duplicates <- streams %>% distinct(song_id, date, country, .keep_all = TRUE)

# check whether it worked
streams_without_duplicates %>% count(song_id, date, country) %>% count(n)

### Do: Let's ask *better* questions using tidyverse

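#1) number of observations for Belgium
streams_without_duplicates %>% filter(country=="BE") %>% count()
#2) sort by number of streams, highest first
streams_without_duplicates %>% arrange(desc(streams))
#3) look up the song with song_id 21831898
songs %>% filter(song_id==21831898)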

#4) compute revenue at 0.0038 per stream: base R first, then the equivalent with mutate()
streams_without_duplicates$revenue <- 0.0038 * streams_without_duplicates$streams

streams_without_duplicates <- streams_without_duplicates %>% mutate(revenue = streams * 0.0038)

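# break if we haven't identified the correct unit of analysis/primary key of the dataset
# (recount on the de-duplicated data; the comparison with 1 belongs inside all())
tmp <- streams_without_duplicates %>% count(song_id, date, country)
stopifnot(all(tmp$n == 1))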

streams_without_duplicates %>% group_by(country) %>% summarise(number_of_observations=n(),
                                                               revenue = sum(revenue))

# total number of streams per song

streams2 <- streams_without_duplicates %>% group_by(song_id) %>% summarise(totstreams = sum(streams))

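# number of songs per artist
songs %>% count(artists)

# exact match on the artist name
songs %>% count(artists) %>% filter(artists=='Ellie Goulding')

# partial, case-insensitive match: any value of artists containing the name
songs %>% count(artists) %>% filter(grepl('Ellie Goulding', artists, ignore.case=TRUE))

# how grepl works: one TRUE/FALSE per element
grepl('Hannes', c('Hannes','Jesper', 'Bo'))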

streams2 <- streams_without_duplicates
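
A note on the primary-key check in the code above: in R, where the closing parenthesis sits in an `all(...) == 1` expression changes what is actually tested. Here is a minimal base-R sketch with made-up counts (the vector `n` and the variable names are hypothetical, not taken from the tutorial data):

```r
# Toy counts per key: the second key appears twice, i.e. there is a duplicate.
n <- c(1, 2, 1)

# Comparison outside all(): all(n) coerces the non-zero numbers to TRUE
# (R warns about the coercion), and TRUE == 1 is TRUE, so this check would
# pass even though a duplicate is present.
check_outside <- suppressWarnings(all(n) == 1)

# Comparison inside all(): each count is compared with 1 first, and the
# results are then combined, so the duplicate is caught.
check_inside <- all(n == 1)

print(check_outside)
print(check_inside)
```

Wrapped in `stopifnot()`, only the second form would stop the script when a key occurs more than once.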
hannesdatta commented 2 years ago

@bodr101

bodr101 commented 2 years ago

I can't seem to find a .Rpres file in the folder 'data exploration'. Is this missing or am I just not looking correctly? I can see a .Rpres for the data preparation for example, but not of today's tutorial.

hannesdatta commented 2 years ago

content/docs/tutorials/data-preparation/tutorial.Rpres

hannesdatta commented 2 years ago

@bodr101, did you incorporate this after all? I think you did, right? Let's close this issue then.

bodr101 commented 2 years ago

Think I totally forgot to actually incorporate this, but should be good now. Hopefully it worked and I added extra slides with the answers and proper coding format, etc.

By the way, how can I see what the changes to the deck will look like without committing? The preview only shows me the changes in the code, not what the deck itself will look like.

hannesdatta commented 2 years ago

Uhh, to be able to see the deck you have to render it yourself in RStudio (“save as website”) and replace the HTML file via a commit.
