International-Soil-Radiocarbon-Database / ISRaD

Repository for the development and release of ISRaD data and tools
https://international-soil-radiocarbon-database.github.io/ISRaD/

Idea: Remove data from R package #130

Closed aahoyt closed 5 years ago

aahoyt commented 5 years ago

This was discussed in detail at AGU.

Possible idea: remove the data object itself from the R package. The data would be hosted on GitHub. Options for accessing the data include: the package could have a function that fetches it from the website, the user could download the data object directly, or the user could run the compile function on the list object, etc.

Reasoning: the data is constantly being updated and, as a result, will require frequent CRAN package updates. Ideally, this change would reduce how often the package needs to be updated on CRAN and how often users need to update their installed version. It is more intuitive to download the latest data file than to update the package each time, which people may forget to do and which also takes a while. Additionally, no one will want to use the CRAN package if it doesn't have the latest data; they will always choose to download the dev branch.

ktoddbrown commented 5 years ago

I think that it's important to version the data object itself for reproducibility of studies/analyses. But otherwise I would agree that separating the scripts and the data object makes a lot of sense. If you can't find other solutions, ISCN may be able to mediate hosting with LBNL and mint DOIs for different versions. There are also a number of archives that would likely be happy to work with you to maintain versioning (EDI, ESS-DIVE, and PANGAEA come to mind).

greymonroe commented 5 years ago

The reasoning to remove the data object from the package makes sense to me. We could add a function that downloads the data from the archive where we choose to keep it. Let's discuss in San Diego.
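For illustration, such a download function might look roughly like the sketch below. Everything specific here is an assumption rather than anything in ISRaD: the function name `fetch_versioned_data`, the file naming scheme `ISRaD_data_v<version>.rds`, and the idea that the archive serves one `.rds` file per version under a base URL. Asking for an explicit version is meant to address the reproducibility point raised above, and the local cache avoids downloading the same file twice.

```r
# Hypothetical helper: the name, arguments, and file layout are assumptions
# for illustration, not part of the ISRaD package or any chosen archive.
fetch_versioned_data <- function(base_url, version, cache_dir = tempdir()) {
  # Assumed naming scheme: one .rds file per released data version.
  file_name <- sprintf("ISRaD_data_v%s.rds", as.character(version))
  dest <- file.path(cache_dir, file_name)

  # Only download when this version is not already cached locally.
  if (!file.exists(dest)) {
    url <- paste(base_url, file_name, sep = "/")
    tmp <- tempfile(fileext = ".rds")
    # mode = "wb" keeps the binary .rds file intact on all platforms.
    status <- utils::download.file(url, tmp, mode = "wb", quiet = TRUE)
    if (status != 0) stop("Download failed for ", url)
    # Copy into the cache only after a successful download.
    file.copy(tmp, dest)
  }

  readRDS(dest)
}
```

A call would then name the version explicitly, e.g. `fetch_versioned_data(base_url, "1.0")`, where `base_url` is a placeholder for wherever the data end up being hosted. Pass the version as a string so that a value like "1.0" is not shortened to "1".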

(Sent by email in reply to Kathe Todd-Brown's comment of Thu, Jan 3, 2019, 12:24 PM.)

greymonroe commented 5 years ago

Discussed, and created a new issue with to-do tasks.