ropensci-archive / doidata

:no_entry: ARCHIVED :no_entry:
MIT License
Default behaviors for "pit of good practice" of provenance and reproducibility #3

Open noamross opened 6 years ago

noamross commented 6 years ago

While a single command to read in data is appealing, it is also true that (a) the form of the data may vary a great deal, and (b) people often want to have the downloaded file on hand for other reasons. read_csv(doidata()) or doidata() %>% read_csv() are still fairly minimal and intuitive. Drawing ideas from fulltext::ft_get_si(), here's a scheme for default behavior that I think is still intuitive and drives users toward best practice in maintaining data provenance and credit while keeping access to the downloaded files:

Another question is what the appropriate behavior should be for versioned data. For instance, Zenodo and Figshare each provide a DOI that always points to the latest version of the data, plus a separate DOI for each version. One possibility: