edhenry / chexnet

Implementation and fullstack pipeline for CheXNet classifier
MIT License

Pachyderm Support #24

Open edhenry opened 5 years ago

edhenry commented 5 years ago

Is your feature request related to a problem? Please describe. When demoing this pipeline, there have been a lot of questions about using a tool to version datasets. Dataset versioning would make experiment replication more robust for reproducibility studies.

Describe the solution you'd like: I would like to be able to use Pachyderm to provide this dataset versioning.

  1. Create Ansible roles to allow for creation and deletion of a minikube cluster (an example role can be found here: https://github.com/vrischmann/ansible-role-minikube).
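
Once a cluster is available, versioning a data set with Pachyderm could look roughly like the sketch below. It only builds the `pachctl` command lines (`pachctl create repo` and `pachctl put file`) and does not run them. The repo name `chexnet-data`, the branch `master`, the file `experiments/train.pkl`, and the helper `pachctl_commands` are all hypothetical, and the exact `pachctl` syntax may differ between Pachyderm releases, so treat this as an assumption to check against the installed version.

```python
import shlex
from pathlib import Path
from typing import List


def pachctl_commands(repo: str, branch: str, dataset: Path) -> List[List[str]]:
    """Build (but do not run) the pachctl calls that would version one file.

    Hypothetical helper: the repo, branch, and path layout are assumptions.
    """
    # Pachyderm addresses a file as <repo>@<branch>:/<path>.
    target = f"{repo}@{branch}:/{dataset.name}"
    return [
        ["pachctl", "create", "repo", repo],
        ["pachctl", "put", "file", target, "-f", str(dataset)],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them by hand.
    for cmd in pachctl_commands("chexnet-data", "master", Path("experiments/train.pkl")):
        print(shlex.join(cmd))
```

Each `put file` creates a new commit on the branch, which is what would give the data sets a history that can be referenced when replicating an experiment.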

Describe alternatives you've considered: Currently I am pickling the data sets into the respective ./experiments/ directory. This does not allow for reasonable versioning or tracking of the data sets, because nothing records which version of a data set a given experiment used.
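
To illustrate the limitation, here is a minimal sketch of the pickling approach. The helper `save_dataset`, the directory layout, and the file name are hypothetical and not taken from the repository. The SHA-256 digest is added only to show that two saves differ; the pickle file itself keeps no history.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path
from typing import Any


def save_dataset(data: Any, experiment_dir: Path, name: str) -> str:
    """Pickle `data` into experiment_dir and return the SHA-256 of the bytes written."""
    experiment_dir.mkdir(parents=True, exist_ok=True)
    payload = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
    # Writing to the same path silently replaces any earlier version.
    (experiment_dir / f"{name}.pkl").write_bytes(payload)
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp) / "experiments" / "run1"
        first = save_dataset([1, 2, 3], run_dir, "train")
        second = save_dataset([1, 2, 4], run_dir, "train")
        # The digests differ, but only the second data set remains on disk.
        print(first != second)
        print(pickle.loads((run_dir / "train.pkl").read_bytes()))
```

After the second call only the latest data set is recoverable, which is the gap a versioned store such as Pachyderm is meant to close.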

Additional context: Integrate back-end storage for persisting the versioned data sets.