Open Kagandi opened 6 years ago
We don't have it now. This is a great feature request!
For one of my own projects I have implemented cross-validation and k-fold helpers that work with turicreate.
@Kagandi, thank you. We will definitely have a look. Feel free to submit this as a Pull Request.
Will you add cross_validation.KFold to turicreate or not? @igiloh @Kagandi @znation @srikris @hoytak @afranklin
Not sure why this was closed. It's a reasonable feature request. Reopening.
We still don't have cross validation support. However, we did just add a shuffle method for SFrame. That should make it simpler to do cross validation yourself.
To do k-fold cross validation: call shuffle on your SFrame, then divide it into k equal segments.
Here is a function I wrote to do cross validation:
def get_cross_validation_generator(sf, k):
    '''
    Parameters
    ----------
    sf : SFrame
        The SFrame on which to do cross validation
    k : int
        The number of folds

    Returns
    -------
    out : generator
        The generator yields a tuple with two members. The first
        member of the tuple is the train set SFrame. The second member
        is the test set.
    '''
    # Shuffle once so that each fold is a random sample of the rows
    sf = sf.shuffle()
    fold_size = len(sf) // k

    for i in range(k-1):
        test_set_start = i * fold_size
        test_set_end = (i+1) * fold_size
        cur_test = sf[test_set_start:test_set_end]
        # Train on everything outside the test slice; append() concatenates rows
        cur_train = sf[:test_set_start].append(sf[test_set_end:])
        yield cur_train, cur_test

    # Add any left over portion to the final test set
    final_divide = (k-1) * fold_size
    yield sf[:final_divide], sf[final_divide:]
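To see how the leftover rows end up in the last test set, here is a small sketch of the same index arithmetic in plain Python, with no SFrame involved. The helper name fold_bounds is made up for illustration and is not part of turicreate.

```python
def fold_bounds(n_rows, k):
    """Return the (start, end) row range of each test fold.

    Hypothetical helper: it mirrors the slicing arithmetic of the
    generator above but works on plain integers only.
    """
    fold_size = n_rows // k
    # The first k-1 folds all have exactly fold_size rows
    bounds = [(i * fold_size, (i + 1) * fold_size) for i in range(k - 1)]
    # The final fold absorbs whatever is left over
    bounds.append(((k - 1) * fold_size, n_rows))
    return bounds


# 11 rows in 5 folds: four folds of 2 rows and a last fold of 3 rows
print(fold_bounds(11, 5))  # [(0, 2), (2, 4), (4, 6), (6, 8), (8, 11)]
```

Note that the last fold can be noticeably larger than the others when len(sf) is not a multiple of k, since up to k-1 leftover rows land in it.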
Here is an example of using it:
# Test get_cross_validation_generator
import turicreate as tc
sf = tc.SFrame({'a': range(11)})
for train, test in get_cross_validation_generator(sf, 5):
    print(train)
    print(test)
    print("\n\n")
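In practice you usually want one averaged score rather than printed folds. Below is a rough sketch of that loop. The names average_metric, fit_mean and mean_abs_error are hypothetical, and the toy "model" (the mean of the training values) and the hand-written folds are stand-ins so the snippet runs without turicreate.

```python
from statistics import mean


def average_metric(folds, fit, evaluate):
    """Fit on each train set, score on the matching test set, and average.

    folds    : iterable of (train, test) pairs, e.g. the generator above
    fit      : callable(train) -> model
    evaluate : callable(model, test) -> float
    """
    scores = []
    for train, test in folds:
        model = fit(train)
        scores.append(evaluate(model, test))
    return mean(scores), scores


# Toy stand-ins (assumptions, not turicreate APIs)
def fit_mean(train):
    # The "model" is simply the mean of the training values
    return mean(train)


def mean_abs_error(model, test):
    # Mean absolute error of predicting the same value for every row
    return mean(abs(x - model) for x in test)


toy_folds = [
    ([3, 4, 5, 6], [1, 2]),
    ([1, 2, 5, 6], [3, 4]),
    ([1, 2, 3, 4], [5, 6]),
]
avg, per_fold = average_metric(toy_folds, fit_mean, mean_abs_error)
print(per_fold)  # [3.0, 0.5, 3.0]
```

With turicreate, fit could wrap a call such as tc.linear_regression.create(train, target=...) and evaluate could pull one value out of model.evaluate(test). The generator's output can be passed directly as folds.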
In the previous version of turicreate (graphlab-create-2.1) there was a cross validation module that included cross_validation and KFold. I wasn't able to find them anywhere in the current documentation or code. It would be great to have cross validation and KFold as part of Turi.