scienceisfiction opened this issue 8 years ago
Hi @scienceisfiction, you might want to talk to @coreenforbes because she did a whole bunch of research on cloud backup services for our lab group meeting.
@BIOL548O/all ,
Great question @scienceisfiction ! Does anybody else want to explore storing large datasets on the web? I know @JoeyBernhardt was telling me about having this challenge with her data, too. We could certainly study some of the solutions below as a class, if people are interested!
Here are some ideas:

- Amazon S3, which you can access from R with the `aws.s3` package. I can help you set that up if you want!
- Dropbox, using the `rdrop2` package to access your data. You don't need to have that folder synchronized with your computer (i.e. you could just leave your stuff on the cloud and load it into R when you need it).

Does that help? Would anyone else like to add anything?
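Here's a minimal sketch of the Dropbox option (not run here; the path `"data/mydata.csv"` is just a placeholder for wherever your file lives in your Dropbox):

```r
library(rdrop2)

# opens a browser window once to authorize access to your Dropbox
drop_auth()

# reads the remote csv straight into R -- no local synced copy needed
df <- drop_read_csv("data/mydata.csv")
```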
I have summoned the Science Nerds of Twitter and they have spoken: S3 is a popular choice, as is Dropbox.
However, I also learned that `readr` can read zipped csvs! @JoeyBernhardt, try this out:
```r
library(readr)

# write mtcars to a temporary csv
mtcars_path <- tempfile(fileext = ".csv")
write_csv(mtcars, mtcars_path)

# zip up the file
zipname <- paste0(mtcars_path, ".zip")
zip(zipname, mtcars_path)

# read_csv() can read the zipped csv directly
mt_from_zip <- read_csv(zipname)
```
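As an aside (I haven't run this here, but per `readr`'s documentation), you can skip the explicit `zip()` step: `write_csv()` compresses automatically when the path ends in `.gz`, and `read_csv()` decompresses on read based on the extension.

```r
library(readr)

# writing to a *.gz path compresses the output automatically
mtcars_gz <- tempfile(fileext = ".csv.gz")
write_csv(mtcars, mtcars_gz)

# read_csv() sees the .gz extension and decompresses on read
mt_from_gz <- read_csv(mtcars_gz)
```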
oh cool! thanks @aammd! I will try this :smile:
These are all really helpful, though I don't actually have time to pursue them now! After the class is over, will these discussions live on somewhere? I'd love to be able to circle back to this and some of the other tips and tricks that have come up in other discussions once term is over and I have a little more time to explore.
This Discussion thread will live forever
I was wondering if anyone has any recommendations for cloud storage for large and varied data sets (raw data may be video, image, etc. files, not just txt or csv). Something that can handle large and weird objects but maybe also has some helpful features that work with R, Git, etc.?
Thanks! Melissa A