IBM / mi-prometheus

Enabling reproducible Machine Learning research
http://mi-prometheus.rtfd.io/
Apache License 2.0
42 stars 18 forks

Implement index_splitter worker #30

Closed tkornuta-ibm closed 5 years ago

tkornuta-ibm commented 5 years ago

The goal of this worker is to create files with indices that split the dataset into distinct subsets.

The required usage:
1) the user provides the output dir where the files with indices will be created (--o)
2) the user provides the problem name (--p) OR the length of the dataset (--l)
3) the user provides the split (--s), a value from 1 to l-2; these are the border cases in which one of the splits contains a single index
4) additional option: random_sampling (--r, DEFAULT: true)
-- when random_sampling is on, both files will contain lists of indices
-- when off, both files will contain ranges, i.e. [0, s-1] and [s, l] respectively
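The splitting logic described above could be sketched roughly as follows. This is only an illustration, not the worker's actual implementation: the function name `split_indices` and the `seed` parameter are hypothetical, indices are assumed to be 0-based (so the second range is taken to end at l-1), and writing the two results to files in the output dir is left out.

```python
import random
from typing import List, Optional, Tuple


def split_indices(
    length: int,
    split: int,
    random_sampling: bool = True,
    seed: Optional[int] = None,
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: split indices 0..length-1 into two subsets.

    With random_sampling on, returns two lists of shuffled indices.
    With random_sampling off, returns two inclusive [first, last] ranges.
    """
    # Bounds follow the range stated in the issue (1 to l-2).
    if not 1 <= split <= length - 2:
        raise ValueError(f"split must be in [1, {length - 2}], got {split}")

    if random_sampling:
        indices = list(range(length))
        # A dedicated Random instance keeps the shuffle reproducible
        # without touching the global random state.
        random.Random(seed).shuffle(indices)
        return indices[:split], indices[split:]

    # Assumption: 0-based indexing, so the last valid index is length - 1.
    return [0, split - 1], [split, length - 1]


if __name__ == "__main__":
    first, second = split_indices(10, 3, random_sampling=False)
    print(first, second)
```

Passing a fixed `seed` makes the random split repeatable, which seems in keeping with the project's focus on reproducibility, but whether the worker should expose such an option is an open design choice.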

vmarois commented 5 years ago

Addressed in #36