epiverse-trace / epiparameter

R package with library of epidemiological parameters for infectious diseases and functions and classes for working with parameters
https://epiverse-trace.github.io/epiparameter

Data storage #4

Open sbfnk opened 2 years ago

sbfnk commented 2 years ago

It might be worth thinking about what we want to store (e.g. individual data points if available, or only summary statistics and sample size) and how to store it. CSV is probably fine for now, but if we want to invite external contributions, something like a curated Google Sheet might be more amenable (and could easily be imported here, or regularly updated).
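To make the "summary statistics only" option concrete, here is a minimal sketch of what one row per estimate could look like as a CSV. The column names and all values are hypothetical placeholders for illustration, not the package's actual schema or real estimates.

```r
# Hypothetical schema: one row per parameter estimate, summary statistics only.
# All values below are made-up placeholders, not real epidemiological estimates.
params <- data.frame(
  disease      = c("disease_a", "disease_b"),
  parameter    = c("incubation_period", "serial_interval"),
  distribution = c("lognormal", "gamma"),
  mean         = c(5, 4),
  sd           = c(2, 1.5),
  sample_size  = c(100L, 50L),
  reference    = c("placeholder_ref_1", "placeholder_ref_2"),
  stringsAsFactors = FALSE
)

# Round-trip through a CSV file to check that the table survives storage.
csv_path <- tempfile(fileext = ".csv")
write.csv(params, csv_path, row.names = FALSE)
params_back <- read.csv(csv_path, stringsAsFactors = FALSE)
```

Storing individual data points instead would probably need a second, long-format table keyed on the same disease and parameter columns.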

thibautjombart commented 2 years ago

Just to flag a related issue with the current mode of storage.

I can see some pros/cons, listing some below. After writing this I might lean towards a dual system.

Store as csv (current version)

Assuming files are used for internal purposes and moved to inst/ (see related issue).
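For reference, a file shipped under inst/ can be located at run time with `system.file()`. The sketch below assumes a hypothetical `inst/extdata/parameters.csv`; the subdirectory, the file name, and the helper `load_parameters()` are illustrative and not taken from the package.

```r
# Hypothetical helper: read a CSV that the package ships under inst/extdata/.
# The file name "parameters.csv" is an assumption made for this sketch.
load_parameters <- function(package = "epiparameter", file = "parameters.csv") {
  # system.file() returns "" when the package or file cannot be found.
  path <- system.file("extdata", file, package = package)
  if (!nzchar(path)) {
    stop("Could not find '", file, "' in package '", package, "'")
  }
  utils::read.csv(path, stringsAsFactors = FALSE)
}
```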

Pros

Cons

Store online

Assuming the package pulls content off the web in real time, e.g. using googlesheets4.

Pros

Cons

Dual system

This would involve:

  1. having a user-facing database people can contribute to
  2. making periodic releases of the database (i.e. a timestamped and curated version)
  3. pulling the latest release into the package as a local csv file
  4. timing package releases with database releases

I think it would have most of the pros and not many of the cons. A bit heavier maintenance on our end, but some of the process may be automated.
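Steps 2 and 3 could be sketched roughly as below, assuming releases are written as date-stamped CSV files in a single directory. The helpers `write_release()` and `latest_release()`, the file naming scheme, and the required columns are all hypothetical.

```r
# Hypothetical helper: snapshot the curated table as a timestamped release.
write_release <- function(db, dir, release_date = Sys.Date()) {
  required <- c("disease", "parameter")  # assumed minimal columns
  missing_cols <- setdiff(required, names(db))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  file_name <- sprintf("parameters_%s.csv", format(release_date, "%Y-%m-%d"))
  path <- file.path(dir, file_name)
  utils::write.csv(db, path, row.names = FALSE)
  path
}

# Hypothetical helper: find the most recent release in a directory.
latest_release <- function(dir) {
  files <- list.files(
    dir,
    pattern = "^parameters_[0-9]{4}-[0-9]{2}-[0-9]{2}[.]csv$",
    full.names = TRUE
  )
  if (length(files) == 0) stop("No releases found in ", dir)
  # ISO dates sort lexicographically, so the last file is the latest release.
  sort(files)[length(files)]
}

# Usage with placeholder data in a temporary directory.
release_dir <- tempfile("releases")
dir.create(release_dir)
db <- data.frame(disease = "disease_a", parameter = "incubation_period")
write_release(db, release_dir, as.Date("2022-01-01"))
write_release(db, release_dir, as.Date("2022-06-01"))
local_copy <- read.csv(latest_release(release_dir), stringsAsFactors = FALSE)
```

In this arrangement, the package would only ever read the file returned by `latest_release()` at build time, so a package release is tied to one database release.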