Closed RajK853 closed 3 years ago
Hi, there's a shuffle for each epoch for non-streamed corpora when training batches are created in the training loop here:
Edited to add: I added the `shuffle` option to `Corpus` when we were considering some other alternatives for the streamed corpora, and it didn't seem like a problem to go ahead and leave it in as an option for other uses of `Corpus`. The default training loop has always shuffled the training examples for finite corpora for each epoch.
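To illustrate the per-epoch shuffle described above, here is a minimal, self-contained sketch. It is not spaCy's actual training loop: `iter_epoch_batches`, `batch_size`, and `seed` are hypothetical names chosen for this example, and it assumes the finite corpus fits in memory as a list.

```python
import random
from typing import Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def iter_epoch_batches(
    examples: Sequence[T], batch_size: int, max_epochs: int, seed: int = 0
) -> Iterator[Tuple[int, List[T]]]:
    """Yield (epoch, batch) pairs, reshuffling the examples before each epoch."""
    rng = random.Random(seed)  # local RNG so the sketch is reproducible
    pool = list(examples)      # assumption: a finite corpus loaded into memory
    for epoch in range(max_epochs):
        rng.shuffle(pool)      # shuffle happens in the loop, not in the corpus reader
        for start in range(0, len(pool), batch_size):
            yield epoch, pool[start:start + batch_size]


# Usage: two epochs over six toy "examples", in batches of two.
batches = list(iter_epoch_batches(range(6), batch_size=2, max_epochs=2))
for epoch, batch in batches:
    print(epoch, batch)
```

The point of the sketch is that shuffling lives in the loop that builds batches, so the corpus reader itself can return examples in a fixed order and each epoch still sees them in a different order.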
Ah, now it makes sense. Thanks for the clarification.
While going through the source code of `spacy.Corpus.v1`, it seems like reference docs are never shuffled. https://github.com/explosion/spaCy/blob/master/spacy/training/corpus.py
Registration part of Corpus.
Default value of shuffle in the Corpus class:
It seems the `shuffle` parameter cannot be passed to `spacy.Corpus.v1` from the config file, and as a result the `shuffle` attribute is always set to `False` by default, resulting in the reference docs not being shuffled.
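The pattern described above can be sketched with a toy registry. This is not spaCy's code: `ToyCorpus`, `READERS`, `register`, `resolve`, and `create_toy_reader` are hypothetical names that only mimic the situation, assuming a registered factory whose signature does not include `shuffle`.

```python
import random
from typing import Any, Callable, Dict, Iterator, List

READERS: Dict[str, Callable[..., Any]] = {}  # hypothetical stand-in for a reader registry


def register(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Register a factory function under a string name."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        READERS[name] = func
        return func
    return decorator


class ToyCorpus:
    """Toy corpus whose constructor accepts shuffle, defaulting to False."""

    def __init__(self, items: List[str], *, limit: int = 0, shuffle: bool = False) -> None:
        self.items = items
        self.limit = limit
        self.shuffle = shuffle

    def __call__(self) -> Iterator[str]:
        items = list(self.items)
        if self.shuffle:
            random.shuffle(items)
        if self.limit:
            items = items[: self.limit]
        yield from items


@register("toy.Corpus.v1")
def create_toy_reader(items: List[str], limit: int = 0) -> ToyCorpus:
    # The factory has no shuffle parameter, so the class default (False) always applies.
    return ToyCorpus(items, limit=limit)


def resolve(config: Dict[str, Any]) -> Any:
    """Look up the factory named in the config and call it with the remaining keys."""
    kwargs = dict(config)
    factory = READERS[kwargs.pop("@readers")]
    return factory(**kwargs)


corpus = resolve({"@readers": "toy.Corpus.v1", "items": ["a", "b", "c"]})
print(corpus.shuffle)  # the class default, since the factory never sets it
```

In this toy version, adding `"shuffle": True` to the config raises a `TypeError` because the factory does not accept that key. A real config system may report the problem differently, but the effect is the same: the option exists on the class and is unreachable through the registered function.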