JuliaStats / MLBase.jl

A set of functions to support the development of machine learning algorithms
MIT License

Add random train-test splitting #28

Open abbradar opened 8 years ago

abbradar commented 8 years ago

This can be implemented with the sample family of functions from StatsBase. An example implementation with an sklearn-like interface is here. If that's okay, I can make a PR; what holds me back is that I'm a newcomer and may have simply missed an existing, obvious way to do it.

EDIT: supporting several arrays simultaneously would also be a nice addition; I'll work on this if it's considered useful.
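
To make the proposal concrete, here is a minimal sketch of a random train-test split built on StatsBase's sample. It is not the linked implementation and not an existing MLBase function: the helper name train_test_indices and its test_fraction keyword are hypothetical, chosen only for illustration. Splitting indices rather than data is one way to apply the same split to several arrays at once.

```julia
using StatsBase: sample

# Hypothetical helper (not part of MLBase or StatsBase): split the indices
# 1:n into disjoint train and test sets, drawing the test set at random.
function train_test_indices(n::Integer; test_fraction::Real = 0.25)
    n_test = round(Int, test_fraction * n)
    # Sample without replacement so that no index is drawn twice.
    test = sample(1:n, n_test; replace = false)
    # Everything not drawn for the test set goes to the training set.
    train = setdiff(1:n, test)
    return train, test
end

# Apply the same split to several equally long vectors.
x = collect(1:10)
y = x .^ 2
train, test = train_test_indices(length(x); test_fraction = 0.3)
x_train, x_test = x[train], x[test]
y_train, y_test = y[train], y[test]
```

Because both vectors are indexed with the same train and test indices, corresponding elements of x and y stay paired after the split.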

bobbywlindsey commented 8 years ago

I think the sample function in StatsBase doesn't let the user specify the dimension along which to sample. So in practice, it's only useful for 1-dimensional arrays.
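
One possible way around this limitation, assuming Julia 1.x, is to sample indices along the chosen dimension and then index the array with them. The sketch below is only an illustration: sample_along and its dim keyword are hypothetical names, not part of StatsBase or MLBase.

```julia
using StatsBase: sample

# Hypothetical helper: pick k random slices of A along dimension dim by
# sampling indices without replacement and then selecting those slices.
function sample_along(A::AbstractArray, k::Integer; dim::Integer = 1)
    idx = sample(1:size(A, dim), k; replace = false)
    # selectdim returns a view, so no data is copied here.
    return selectdim(A, dim, idx), idx
end

X = reshape(collect(1:20), 5, 4)   # 5 rows (features) x 4 columns (observations)
cols, idx = sample_along(X, 2; dim = 2)
```

Here cols is a 5x2 view containing the two randomly chosen columns of X, and idx records which columns were picked so the same selection can be reused on other arrays.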