Closed udibr closed 9 years ago
By looking at the code, I'd disagree. Have you confirmed this with experiments?
So `DataStream` uses the iteration scheme here:
https://github.com/arasmus/ladder/blob/master/run.py#L164
which is set to `ShuffledScheme` here:
https://github.com/arasmus/ladder/blob/master/run.py#L133
That shuffles the labeled samples, so SGD should work in that sense.
Do you agree?
agree
When `balanced_classes=True` in line https://github.com/arasmus/ladder/blob/master/run.py#L154, the examples from each class are added one after the other, but there is no additional shuffling of `i_labeled`. The only shuffling (with `dseed`) is applied to the entire data set in `setup_data`, but then `make_datastream` is called, and it sorts out the labeled examples by class and undoes that shuffling. This can hurt SGD optimization.
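To make the concern concrete, here is a minimal sketch in plain Python. It is not the code from `run.py`: the helpers `select_balanced_indices` and `shuffled_batches` are hypothetical names, and the toy labels are made up. It only shows that collecting indices class by class yields a class-grouped `i_labeled`, and that shuffling the indices again at iteration time (the role the thread attributes to `ShuffledScheme`) mixes the classes across batches.

```python
import random


def select_balanced_indices(labels, n_per_class):
    """Collect n_per_class indices for each class, one class after another."""
    i_labeled = []
    for c in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == c]
        i_labeled.extend(idx[:n_per_class])
    return i_labeled


def shuffled_batches(indices, batch_size, rng):
    """Shuffle a copy of the index list, then yield it in batches (one epoch)."""
    order = list(indices)
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


rng = random.Random(0)

# Toy data set: 3 classes, 20 examples each, shuffled once up front
# (standing in for the data-set-wide shuffle described above).
labels = [i % 3 for i in range(60)]
rng.shuffle(labels)

# Class-by-class selection: the up-front shuffle no longer shows in the order.
i_labeled = select_balanced_indices(labels, n_per_class=4)
grouped = [labels[i] for i in i_labeled]  # class blocks: 0s, then 1s, then 2s

# Reshuffling the selected indices at iteration time mixes the classes again.
batches = list(shuffled_batches(i_labeled, batch_size=4, rng=rng))
```

If the batches were taken from `i_labeled` in order, each batch here would contain a single class. Whether that actually happens in the repository depends on whether the iteration scheme reshuffles the labeled indices, which is the point under discussion.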