Open matanzuckerman opened 5 years ago
You added shuffling for the case where the test set is taken as a part of the full data set. The Jupyter notebooks did not cover this case. Note that in cross-validation (which is more or less similar) shuffling is already performed. Do you actually split the full data set in your work? I suggest using a one-time cross-validation for this purpose.
@semion1956 Hi. There is a chance I will split the full dataset into train and test sets. Usually I use cross-validation, but without the shuffling it won't work.
@matanzuckerman Hi. I only want to note that cross-validation starts by merging the train and test data sets and shuffling the resulting "full" data set (without actually changing the original data, of course).
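For readers following the thread, the merge-then-shuffle step described above might look roughly like the sketch below. It is only an illustration, not the project's code: the function name `cross_validation_folds`, the `n_folds` and `seed` parameters, and the use of plain lists are all assumptions.

```python
import random


def cross_validation_folds(train, test, n_folds=5, seed=0):
    """Yield (training, held_out) pairs from the merged, shuffled data.

    Hypothetical helper: the name and parameters are assumptions
    made for illustration only.
    """
    # Build a new list, so the original train/test sets are not modified.
    full = list(train) + list(test)
    # Shuffle the merged copy with a seeded generator for reproducibility.
    random.Random(seed).shuffle(full)
    # Deal the shuffled samples into n_folds roughly equal folds.
    folds = [full[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        held_out = folds[i]
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, held_out
```

Because the shuffle runs on a copy, the original train and test sets keep their order, which matches the "without actually changing the original data" remark.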
I added random.shuffle at line 101. Without the shuffling we saw discrepancies between the Jupyter notebook and the script.
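A minimal sketch of shuffling before a train-test split, for anyone who wants to reproduce the idea outside the script. The function name `shuffled_split` and the `test_fraction` and `seed` parameters are hypothetical, and the script itself may call `random.shuffle` differently.

```python
import random


def shuffled_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples, then split it into (train, test).

    Hypothetical helper: the name and parameters are assumptions
    made for illustration only.
    """
    shuffled = list(samples)  # copy, so the caller's data keeps its order
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible splits
    n_test = int(len(shuffled) * test_fraction)
    # The first n_test shuffled samples form the test set, the rest the train set.
    return shuffled[n_test:], shuffled[:n_test]
```

Without the shuffle, the test set would simply be the first block of the data in its stored order, so any ordering in the file (for example, samples sorted by class) would leak into the split.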
Thanks