Is your feature request related to a problem? Please describe.
When demoing this pipeline, there have been a lot of questions about using a tool to version the data sets. Data set versioning would make experiment replication more robust for reproducibility studies.
Describe the solution you'd like
I would like to be able to use Pachyderm to version the data sets.
Describe alternatives you've considered
Currently I am pickling the data sets into the respective ./experiments/ directory. This does not allow for reasonable versioning or tracking of the data sets.
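For illustration, the current workaround behaves roughly like the sketch below. This is not the pipeline's actual code: the function names, the `dataset.pkl` file name, and the `run_01` directory are hypothetical and only show why plain pickling leaves no version history.

```python
import pickle
import tempfile
from pathlib import Path


def save_dataset(data, experiment_dir, filename="dataset.pkl"):
    """Pickle `data` into the experiment directory, replacing any earlier copy."""
    # `filename` is a hypothetical default, not taken from the pipeline.
    target = Path(experiment_dir) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("wb") as handle:
        pickle.dump(data, handle)
    return target


def load_dataset(experiment_dir, filename="dataset.pkl"):
    """Load the pickled data set back from the experiment directory."""
    target = Path(experiment_dir) / filename
    with target.open("rb") as handle:
        return pickle.load(handle)


def demo():
    """Save two revisions of a data set and report what is left on disk."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical experiment directory under ./experiments/.
        experiment_dir = Path(root) / "experiments" / "run_01"
        save_dataset([1, 2, 3], experiment_dir)
        # The second save silently overwrites the first revision.
        save_dataset([1, 2, 3, 4], experiment_dir)
        files = sorted(path.name for path in experiment_dir.iterdir())
        return load_dataset(experiment_dir), files
```

Calling `demo()` returns only the latest revision and a single file, so the earlier state of the data set cannot be recovered or compared. That is the gap a versioning tool would need to close.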
Additional context
This would also require integrating back-end storage to persist the versioned data sets.