mbannert / boar-2018

Building Online Apps with R - the 2018 version of the Course
1 stars 0 forks source link

Masterliste Psychologie 2018 #1

Open Auer27 opened 5 years ago

Auer27 commented 5 years ago

Google Docs API für diese Liste: https://docs.google.com/spreadsheets/d/1G5Rcj8eb-Vsbpb0u47KWvddKnXDBa7PKpJOHkKWn7-Q/edit#gid=0

https://psyfako.org/index.php/masterplatz-liste/

mbannert commented 5 years ago

@Auer27 if you want to access the sheet directly, @jennybc 's googlesheets API wrapper R package is the way to go here. She works for R Studio, has done a lot of great work and talks on R and the package seems pretty mature. Also there is a googlesheets4 package in the making which is going to work with v4 of the Google APIs. However, I'd go with the CRAN version of the original package for now.

However, I see two possible pitfalls:

1.) Understanding Google's API and terminology... public vs published etc. You definitely need to do some research here. The problem is that reading from your own google account is pretty straight forward but this is only suitable when using R interactively. Because you want to do it in an app, you need to find another solution in order to access public documents published by others (psyfako). I've seen @jennybc has an example of doing exactly that, i.e., public shiny app.

2.) The document itself is pretty ugly. If you look at the number of seats etc. some cells contain nested data, splitting it up for fall and spring term etc. However, I think it would be a nice tidyverse exercise to clean it up - at least partly. Also it would be nice to give Psyfako Feedback on how to do it properly. It would certainly help to make them aware that people in 2018 not just look at data but actually process data.

jennybc commented 5 years ago

Here's me reading that sheet with no auth whatsoever, using googlesheets4 (and, optionally, googledrive). googlesheets4 uses the Sheets v4 API, which is substantially less fiddly around issues of Sheet visibility. It will go to CRAN in the next month. I would regard it as safe to plan around. It will be a substantial improvement over googlesheets (original), both because it's using the newer API and I have a better idea of what I'm doing.

library(googledrive)   # used to isolate file ID from URL
library(googlesheets4)
#> 
#> Attaching package: 'googlesheets4'
#> The following objects are masked from 'package:googledrive':
#> 
#>     request_generate, request_make

## announce that I want to send un-authorized requests
## will be very useful for, say, use in a Shiny app
drive_deauth()
sheets_deauth()

ss_url <- "https://docs.google.com/spreadsheets/d/1G5Rcj8eb-Vsbpb0u47KWvddKnXDBa7PKpJOHkKWn7-Q/edit#gid=0"
read_sheet(as_id(ss_url), range = "A4:N")
#> Reading from 'Masterliste Psychologie 2018'
#> Range "'Masterliste'!A4:N998"
#> # A tibble: 61 x 14
#>    Stadt `Name des Studi… `Schwerpunkte (… Plätze Bewerbungszeitr…
#>    <chr> <chr>            <chr>            <list> <list>          
#>  1 Aach… Psychologie (M.… "empirischen Gr… <NULL> <chr [1]>       
#>  2 Bamb… Psychologie (M.… Allgemeiner Mas… <chr … <chr [1]>       
#>  3 Berl… Psychologie (M.… <NA>             <NULL> <NULL>          
#>  4 Berl… Psychologie (M.… "Methodenlehre … <dbl … <chr [1]>       
#>  5 Biel… Psychologie (M.… "Allgemeiner Ma… <dbl … <chr [1]>       
#>  6 Boch… Klinische Psych… Klinische Psych… <dbl … <chr [1]>       
#>  7 Boch… Psychologie und… Kognitive Neuro… <dbl … <chr [1]>       
#>  8 Boch… Wirtschaftspsyc… Wirtschaftspsyc… <dbl … <chr [1]>       
#>  9 Bonn  Psychologie (M.… "Allgemeiner Ma… <chr … <chr [1]>       
#> 10 Brau… Psychologie (M.… -                <NULL> <chr [1]>       
#> # … with 51 more rows, and 9 more variables: `Zeugnis nach-\nreichen
#> #   bis` <list>, `Einzusendende\nUnterlagen` <chr>, `Datum der
#> #   ersten\nZulassungsaussendungen` <chr>, `NC WS17/18` <list>, `NC WS
#> #   16/17` <list>, `Wartesemester\n/ Losverfahren` <list>,
#> #   `Zulassungskritierien & -beurteilung` <list>, Anmerkungen <list>,
#> #   `Links: Webseite Studiengang / Studienverlaufsplan / Modulhandbuch /
#> #   Zulassungsordnung` <chr>

## no googledrive at all
ss_id <- "1G5Rcj8eb-Vsbpb0u47KWvddKnXDBa7PKpJOHkKWn7-Q"
read_sheet(ss_id, range = "A4:N")
#> Reading from 'Masterliste Psychologie 2018'
#> Range "'Masterliste'!A4:N998"
#> # A tibble: 61 x 14
#>    Stadt `Name des Studi… `Schwerpunkte (… Plätze Bewerbungszeitr…
#>    <chr> <chr>            <chr>            <list> <list>          
#>  1 Aach… Psychologie (M.… "empirischen Gr… <NULL> <chr [1]>       
#>  2 Bamb… Psychologie (M.… Allgemeiner Mas… <chr … <chr [1]>       
#>  3 Berl… Psychologie (M.… <NA>             <NULL> <NULL>          
#>  4 Berl… Psychologie (M.… "Methodenlehre … <dbl … <chr [1]>       
#>  5 Biel… Psychologie (M.… "Allgemeiner Ma… <dbl … <chr [1]>       
#>  6 Boch… Klinische Psych… Klinische Psych… <dbl … <chr [1]>       
#>  7 Boch… Psychologie und… Kognitive Neuro… <dbl … <chr [1]>       
#>  8 Boch… Wirtschaftspsyc… Wirtschaftspsyc… <dbl … <chr [1]>       
#>  9 Bonn  Psychologie (M.… "Allgemeiner Ma… <chr … <chr [1]>       
#> 10 Brau… Psychologie (M.… -                <NULL> <chr [1]>       
#> # … with 51 more rows, and 9 more variables: `Zeugnis nach-\nreichen
#> #   bis` <list>, `Einzusendende\nUnterlagen` <chr>, `Datum der
#> #   ersten\nZulassungsaussendungen` <chr>, `NC WS17/18` <list>, `NC WS
#> #   16/17` <list>, `Wartesemester\n/ Losverfahren` <list>,
#> #   `Zulassungskritierien & -beurteilung` <list>, Anmerkungen <list>,
#> #   `Links: Webseite Studiengang / Studienverlaufsplan / Modulhandbuch /
#> #   Zulassungsordnung` <chr>

Created on 2018-12-02 by the reprex package (v0.2.1.9000)

mbannert commented 5 years ago

@Auer27 I guess you got a pretty good kickstart here. I recommend to follow her advice and go with googlesheets4 here as @jennybc is the author of the package. Also, remember our discussion about reproducibility and notice the nice reproducible example here (created with the reprex package).

Given that the original provider has some homework to do w.r.t tidy data, I assume the interface might change here and there. Thus I'd go with the simplest possible solution, i.e., no googledrive at all. Hopefully once you can show them the app they might understand the value of keeping their interface (sheet) consistent.

Auer27 commented 5 years ago

Hey @jennybc and @mbannert, thank you so much for your ideas. I'm sure it'll help us a lot! :) Our group will have a closer look at the packages during christmas break. We will keep you updated how the project is going.