openclimatefix / PVNet

PVnet main repo
MIT License

Experiment india/004 potential bug #228

Closed dfulu closed 2 weeks ago

dfulu commented 5 months ago

I think there may be a mistake in experiments/india/004_n_training_samples.

The experiment states that the limit_train_batches option was used to limit the number of training batches. However, this discussion suggests that the option only limits the number of batches shown in each training epoch. I think Lightning shuffles all the batches and then limits them, rather than limiting the batches and then shuffling them. If so, the runs with only 100 samples were not restricted to the same 100 samples; instead, the model saw a different 100 samples each epoch. The point of the experiment was to see whether having more training examples was worth it, but if I'm right about how Lightning limits batches, the experiment doesn't address that.
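To make the two orderings concrete, here is a minimal pure-Python sketch. It does not call Lightning and does not claim to reproduce its internals; the function names and the numbers (1000 batches, a limit of 100, 20 epochs) are made up for illustration. It only counts how many distinct batches a model would see under each ordering.

```python
import random


def shuffle_then_limit(n_batches, limit, n_epochs, seed=0):
    """Reshuffle all batches every epoch, then keep the first `limit`."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(n_epochs):
        order = list(range(n_batches))
        rng.shuffle(order)
        seen.update(order[:limit])
    return seen


def limit_then_shuffle(n_batches, limit, n_epochs, seed=0):
    """Fix a subset of `limit` batches once, then reshuffle only that subset."""
    rng = random.Random(seed)
    subset = list(range(min(limit, n_batches)))
    seen = set()
    for _ in range(n_epochs):
        rng.shuffle(subset)
        seen.update(subset)
    return seen


if __name__ == "__main__":
    # Hypothetical sizes, not the ones used in the experiment.
    print(len(shuffle_then_limit(1000, 100, 20)))
    print(len(limit_then_shuffle(1000, 100, 20)))
```

Under limit-then-shuffle the count stays at the limit however many epochs are run, whereas under shuffle-then-limit it grows with the number of epochs. Which of the two applies to the experiment is exactly the open question here.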

peterdudfield commented 5 months ago

Yeah, I thought that might be the case. I did a quick check by running an experiment with only 1 batch per epoch and printing out the file names being used. From what I remember, the same file names were used, so I think this means the shuffling is only done once. We should find this out for certain, though.
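A check like the one described could be automated roughly as follows. This is only a sketch: `same_files_every_epoch` is a hypothetical helper, and the file names in the usage example are invented. It assumes the file names used in each epoch have already been collected into one list per epoch, for example by logging them from the dataset.

```python
def same_files_every_epoch(files_per_epoch):
    """Return True if every epoch used exactly the same set of files.

    `files_per_epoch` is a list with one entry per epoch, each entry being
    the file names logged during that epoch (order within an epoch is ignored).
    """
    distinct_sets = {frozenset(files) for files in files_per_epoch}
    return len(distinct_sets) <= 1


if __name__ == "__main__":
    # Invented file names, purely for illustration.
    fixed = [["a.pt", "b.pt"], ["b.pt", "a.pt"]]
    changing = [["a.pt", "b.pt"], ["c.pt", "d.pt"]]
    print(same_files_every_epoch(fixed))
    print(same_files_every_epoch(changing))
```

If the helper returns True over many epochs, the limited subset is fixed; if it returns False, a different subset is being drawn each epoch.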

dfulu commented 2 weeks ago

I do think this is fine actually