sktime / mlaut

Split Data algorithm #4

Closed ViktorKaz closed 6 years ago

ViktorKaz commented 6 years ago

Split data: this should not duplicate the entire dataset. Instead, create an index into the elements of X_train, y_train, X_test, y_test, or create an iterator object.
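As a rough illustration of the index-based idea, the sketch below stores only integer positions and slices the arrays when they are needed. The helper name `split_indices` and its parameters are hypothetical and not part of mlaut.

```python
import numpy as np


def split_indices(n_samples, test_size=0.25, seed=0):
    """Hypothetical helper: return (train_idx, test_idx) as sorted index arrays."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)       # shuffled positions 0..n_samples-1
    n_test = int(n_samples * test_size)     # number of held-out samples
    return np.sort(perm[n_test:]), np.sort(perm[:n_test])


# Toy data, assumed for illustration only
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

train_idx, test_idx = split_indices(len(X))

# Only the index arrays need to be kept; subsets are materialised on demand
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]
```

Note that NumPy integer indexing still copies the selected rows at the point of use; the saving is that only the small index arrays have to be stored alongside the single original dataset.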

fkiraly commented 6 years ago

A splitter/subsetter might even be nicer than an iterator, since many algorithms make use of array/matrix operations. Have a look at how this is currently done in the model_selection module of sklearn, for example with KFold. There's really no reason to re-invent the wheel - unless of course the wheel is not circular but polygonal, say.
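For reference, a minimal sketch of how sklearn's `KFold` hands back index arrays rather than copies of the data. The toy arrays and the choice of `n_splits=5` are assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy data, assumed for illustration only
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

kf = KFold(n_splits=5, shuffle=True, random_state=0)

# kf.split yields (train_idx, test_idx) pairs of integer index arrays
splits = list(kf.split(X))

for train_idx, test_idx in splits:
    # Subsets are obtained by array indexing, so matrix operations still apply
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
```

The splitter itself holds no data; it only needs the number of samples, which is why the same index pairs can be reused across estimators.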

ViktorKaz commented 6 years ago

This is now implemented. The train/test indices are stored in the output hdf5 in the default group '/split_dts_idx'.
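A sketch of what storing and reading such indices could look like with h5py. Only the group name '/split_dts_idx' comes from the comment above; the file name, the dataset names `train` and `test`, and the in-memory driver are assumptions, and the actual layout mlaut writes may differ.

```python
import h5py
import numpy as np

# Example index arrays, assumed for illustration only
train_idx = np.array([0, 1, 3, 4, 6, 7, 8, 9])
test_idx = np.array([2, 5])

# driver="core" with backing_store=False keeps the file in memory for this demo
with h5py.File("split_demo.h5", "w", driver="core", backing_store=False) as f:
    grp = f.create_group("/split_dts_idx")
    grp.create_dataset("train", data=train_idx)   # hypothetical dataset name
    grp.create_dataset("test", data=test_idx)     # hypothetical dataset name

    # Read the indices back as NumPy arrays
    loaded_train = f["/split_dts_idx/train"][...]
    loaded_test = f["/split_dts_idx/test"][...]
```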