openclimatefix / PVNet

PVnet main repo
MIT License

Experiment india/004 potential bug #228

Open dfulu opened 1 week ago

dfulu commented 1 week ago

I think there may be a mistake in experiments/india/004_n_training_samples.

The experiment states that the limit_train_batches option was used to limit the number of training batches. However, this discussion suggests that the option only limits the number of batches shown in each training epoch. I think it shuffles all the batches and then limits them, rather than limiting the batches and then shuffling them. If so, the experiments with only 100 samples wouldn't have been trained on a fixed set of 100 samples; instead, the model would see a different 100 samples each epoch. The point of the experiment was to see whether having more training examples was worth it, but if I'm right about how Lightning limits batches, the experiment doesn't address that.
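
To make the two orderings concrete, here is a minimal pure-Python simulation. It does not call Lightning and makes no claim about what Lightning actually does; the function name batches_seen and the numbers (1000 batches, a limit of 100, 20 epochs) are made up for illustration.

```python
import random


def batches_seen(n_batches, limit, n_epochs, shuffle_first, seed=0):
    """Return the set of batch indices seen over all epochs.

    shuffle_first=True  -> shuffle every batch each epoch, then keep the first `limit`.
    shuffle_first=False -> keep a fixed subset of `limit` batches, then shuffle within it.
    """
    rng = random.Random(seed)
    indices = list(range(n_batches))
    if not shuffle_first:
        # Limit then shuffle: the subset is fixed once, before any epoch runs.
        indices = indices[:limit]
    seen = set()
    for _ in range(n_epochs):
        order = indices[:]
        rng.shuffle(order)  # reshuffle at the start of every epoch
        seen.update(order[:limit])
    return seen


shuffle_then_limit = batches_seen(1000, 100, 20, shuffle_first=True)
limit_then_shuffle = batches_seen(1000, 100, 20, shuffle_first=False)
print(len(shuffle_then_limit), len(limit_then_shuffle))
```

Under limit-then-shuffle the model only ever sees 100 distinct batches, while under shuffle-then-limit the number of distinct batches grows with the number of epochs, which is the case where the experiment would not measure the effect of training set size.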

peterdudfield commented 2 days ago

Yeah, I thought that might be the case. I did a quick check by running an experiment with only 1 batch per epoch and printing out the file names being used. From what I remember, the same file names were used, so I think this means the shuffling is only done once. We should find this out for certain, though.
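
One way to make that check repeatable is to record the file names used in each epoch and compare them afterwards. The sketch below assumes the file names have already been collected into one list per epoch; the helper compare_epochs and the example file names are hypothetical and not part of PVNet or Lightning.

```python
def compare_epochs(files_per_epoch):
    """Compare the file names logged in each epoch.

    files_per_epoch: list of lists of file names, one inner list per epoch.
    Returns (all_same, n_unique): whether every epoch used the same set of
    files, and how many distinct files were used across all epochs.
    """
    sets = [set(files) for files in files_per_epoch]
    all_same = all(s == sets[0] for s in sets)
    n_unique = len(set().union(*sets))
    return all_same, n_unique


# Hypothetical logs: same files in a different order vs. different files each epoch.
fixed_subset = [["000.pt", "001.pt"], ["001.pt", "000.pt"]]
changing_subset = [["000.pt", "001.pt"], ["417.pt", "093.pt"]]

print(compare_epochs(fixed_subset))     # (True, 2)
print(compare_epochs(changing_subset))  # (False, 4)
```

If every epoch reports the same set, the limited batches are a fixed subset and the experiment measures what it intended; if the number of distinct files keeps growing across epochs, the concern raised above applies.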