h4sci / h4sci-tasks

Tasks and Exercises

Block 2: Task 2 -- Dataset Types #6

Open mbannert opened 3 years ago

mbannert commented 3 years ago

Find an example for each of the following types of datasets:

and come up with a representation for each dataset in memory, i.e., in an R object, AND on disk, i.e., written to a file.

Find = Look it up online. Take it from the FSO, the KOF website or public data providers you work with. You may also want to simulate / draw data as suggested below:

set.seed(123)
d <- rnorm(1000)

Keep the data in memory (in your R session) and find a suitable format to store them on disk. Play around with reading and writing data. Discuss the advantages and disadvantages in your group so that we can evaluate them together in class.
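One way to start playing with this could look like the sketch below. It is only an illustration, not the expected solution: the data.frame layout, the column names `id` and `value`, and the choice of CSV versus RDS are assumptions made for the example.

```r
set.seed(123)
d <- rnorm(1000)

# In memory: a data.frame with an explicit index column
# (column names 'id' and 'value' are made up for this sketch)
df <- data.frame(id = seq_along(d), value = d)

# On disk, option 1: CSV (plain text, readable by other tools)
csv_file <- tempfile(fileext = ".csv")
write.csv(df, csv_file, row.names = FALSE)
df_csv <- read.csv(csv_file)

# On disk, option 2: RDS (binary, R-specific, exactly one object per file)
rds_file <- tempfile(fileext = ".rds")
saveRDS(df, rds_file)
df_rds <- readRDS(rds_file)

# Compare what comes back with what went in
identical(df, df_rds)   # exact comparison for the binary round trip
all.equal(df, df_csv)   # comparison up to numerical tolerance for the text round trip

# Compare file sizes in bytes
file.size(csv_file)
file.size(rds_file)
```

Things to discuss from here: which file can a colleague open without R, and which one preserves types and attributes without extra work?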

Also make sure to run some experiments with the ".RData" format (created with save()). What could be the disadvantage of such a flexible format?
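A possible experiment, sketched under the assumption that you want to see what load() does to objects that already exist in your session. The object `meta` and the overwritten value of `d` are made up for the illustration.

```r
set.seed(123)
d <- rnorm(1000)
meta <- list(source = "simulated", n = length(d))  # hypothetical metadata

# save() can store several objects of any kind in a single file
rdata_file <- tempfile(fileext = ".RData")
save(d, meta, file = rdata_file)

# Pretend this is a later session in which 'd' means something else
d <- "something else"

# load() restores objects under their saved names and
# replaces existing objects of the same name without asking
restored <- load(rdata_file)
restored        # names of the objects that were restored
is.numeric(d)   # 'd' is the numeric vector again

# Loading into a separate environment leaves the workspace untouched
e <- new.env()
load(rdata_file, envir = e)
ls(e)
```

Note that you cannot tell what a .RData file contains, or under which names, before loading it. That is a good starting point for the discussion of disadvantages.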

Hints: see ts(), xts() from the xts package, data.frame, tibbles (from the tidyverse) and data.table, as well as lists and the jsonlite package.
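To illustrate two of these hints, here is a small sketch using ts() and a nested list. The start date, the monthly frequency and the list fields `id`, `frequency` and `values` are assumptions chosen for the example. The JSON part only runs if the jsonlite package is installed.

```r
set.seed(123)
d <- rnorm(1000)

# Time series in memory: 24 monthly observations
# (start date January 2000 is an arbitrary choice for this sketch)
tsd <- ts(d[1:24], start = c(2000, 1), frequency = 12)

# Flatten to a data.frame so it can be written to CSV;
# the time index becomes an ordinary numeric column
ts_df <- data.frame(time = as.numeric(time(tsd)), value = as.numeric(tsd))
csv_file <- tempfile(fileext = ".csv")
write.csv(ts_df, csv_file, row.names = FALSE)

# Nested data in memory: a list that mixes types and lengths
nested <- list(id = "series_a", frequency = 12, values = as.numeric(tsd)[1:3])

# On disk: JSON is a natural fit for nested lists
if (requireNamespace("jsonlite", quietly = TRUE)) {
  json_file <- tempfile(fileext = ".json")
  json <- jsonlite::toJSON(nested, auto_unbox = TRUE, digits = NA)
  writeLines(json, json_file)
  back <- jsonlite::fromJSON(json_file)
  str(back)
}
```

Worth checking in your group: what happens to the ts attributes (start, frequency) when the series goes through CSV, and how would you restore them?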

Hint 2: The fivethirtyeight package is pretty cool. It contains lots of datasets behind stories published on 538, Nate Silver's blog.