Closed · dfulu closed this 2 weeks ago
Yeah, I thought that might be the case. I did a quick check of this by running an experiment with only 1 batch per epoch and printing out the file names being used. From what I remember, the same file names were used each epoch, so I think this means the shuffling is only done once. We should confirm this though.
I do think this is fine actually
I think there may have been a mistake in experiments/india/004_n_training_samples.

The experiment states that the `limit_train_batches` option was used to limit the train batches and conduct the experiment. However, based on this discussion, it seems that option only limits the number of batches shown in each training epoch: Lightning shuffles all the batches and then limits them, rather than limiting the batches and then shuffling. This suggests the experiments with only 100 samples weren't restricted to the same 100 samples; instead, the model saw a different 100 samples each epoch. The point of the experiment was to see whether having more training examples was worth it, but if I'm right about the way Lightning limits batches, the experiment doesn't address that.
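To illustrate the concern, here is a minimal sketch in plain PyTorch (the dataset and batch sizes are made up for illustration, not taken from the experiment). Taking only the first N batches from a shuffled `DataLoader` each epoch, which is effectively what `limit_train_batches` does, generally yields a different subset of samples every epoch, because the loader re-shuffles at the start of each iteration:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy dataset: 1000 samples with integer ids 0..999.
dataset = TensorDataset(torch.arange(1000))
loader = DataLoader(dataset, batch_size=10, shuffle=True)


def first_n_batch_samples(loader, n_batches):
    """Collect the sample ids seen in the first n_batches of one pass."""
    seen = set()
    for i, (batch,) in enumerate(loader):
        if i >= n_batches:
            break
        seen.update(batch.tolist())
    return seen


# Each full iteration over the loader re-shuffles, so the "first 10
# batches" (100 samples) differ between epochs.
epoch1 = first_n_batch_samples(loader, 10)
epoch2 = first_n_batch_samples(loader, 10)
print(len(epoch1), len(epoch2), len(epoch1 & epoch2))
```

If the two printed epoch sets differ, the 100-sample run was not actually trained on a fixed 100 samples. To genuinely fix the training set, you would subset the dataset itself (e.g. `torch.utils.data.Subset`) before building the loader, rather than limiting batches.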