GSTT-CSC / MLOps

Framework for building ML apps
GNU General Public License v3.0

Data toolkit application #19

Closed by laurencejackson 3 years ago

laurencejackson commented 3 years ago

It would be really useful to have an application for initialising data manifests in the root of datasets.

For example, calling something like datatoolkit init in a folder would create a YAML file template with some prefilled values (e.g. dataset size, nfiles etc.) and some fields the user can fill in (e.g. demographics).
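To make the idea concrete, here is a rough sketch of what an init step might do. It is only an illustration under assumptions: the file name manifest.yaml, the function name init_manifest and the field names (name, nfiles, size_bytes, description, demographics) are all hypothetical, and the template is written by hand as flat "key: value" lines rather than with a YAML library.

```python
from pathlib import Path

MANIFEST_NAME = "manifest.yaml"  # hypothetical file name, not specified in this issue


def init_manifest(root: str) -> Path:
    """Write a manifest template into `root` with some values prefilled."""
    root_path = Path(root)
    # Count every file below the dataset root, ignoring the manifest itself
    files = [
        p for p in root_path.rglob("*")
        if p.is_file() and p.name != MANIFEST_NAME
    ]
    lines = [
        "# Prefilled automatically",
        f"name: {root_path.resolve().name}",
        f"nfiles: {len(files)}",
        f"size_bytes: {sum(p.stat().st_size for p in files)}",
        "# To be completed by the user",
        "description:",
        "demographics:",
    ]
    manifest = root_path / MANIFEST_NAME
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest
```

Calling init_manifest(".") from a dataset folder would leave a manifest.yaml next to the data. Note that this sketch overwrites any existing manifest; a real tool would probably want to keep fields the user has already filled in.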

Calling datatoolkit view in a top-level directory could display or export a dataframe in which each series is an individual project data manifest from the directories below.
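A possible shape for the view step, again only as a sketch under assumptions: manifests are taken to be flat "key: value" files named manifest.yaml (a hypothetical name), the small parser below stands in for a proper YAML library, and the function names are made up for illustration. The pandas import is kept inside to_dataframe so that collecting manifests works without pandas installed.

```python
from pathlib import Path

MANIFEST_NAME = "manifest.yaml"  # hypothetical file name, not specified in this issue


def read_flat_manifest(path: Path) -> dict:
    """Parse a flat 'key: value' manifest; a real tool would use a YAML library."""
    record = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything that is not a key: value pair
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        record[key.strip()] = value.strip() or None  # empty fields become None
    return record


def collect_manifests(top: str) -> dict:
    """Map each dataset directory (relative to `top`) to its manifest fields."""
    top_path = Path(top)
    return {
        str(m.parent.relative_to(top_path)): read_flat_manifest(m)
        for m in sorted(top_path.rglob(MANIFEST_NAME))
    }


def to_dataframe(manifests: dict):
    """Build a dataframe with one column (series) per project manifest."""
    import pandas as pd  # imported lazily; requires pandas to be installed
    return pd.DataFrame(manifests)
```

With this layout, to_dataframe(collect_manifests(".")) gives one column per dataset directory and one row per manifest field, and the result could be exported with the usual pandas methods such as to_csv.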

This is a fairly loose project at the moment and has a lot of scope for refinement as it is developed. It would be a good feature project for a new user or trainee.

Bioin4matics commented 3 years ago

Created a new repository called "datatoolkit for MLOps".