rctn / sparsecoding

Reference sparse coding implementations for efficient learning and inference.
BSD 3-Clause "New" or "Revised" License

Iterate through all of dataloader each epoch. #65

Closed: belsten closed this issue 6 months ago

belsten commented 7 months ago

Compute epoch energy as the average of the batch energies.
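For anyone following along, here is a minimal sketch of what the corrected loop looks like. `batch_energy`, the toy dataset, and the hyperparameters are all placeholders, not the actual sparsecoding API:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def batch_energy(batch: torch.Tensor) -> torch.Tensor:
    """Stand-in for the sparse coding energy of one batch (hypothetical)."""
    return (batch ** 2).mean()

dataset = TensorDataset(torch.randn(1000, 64))  # toy data
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)

n_epochs = 10
epoch_energies = []
for epoch in range(n_epochs):
    energies = []
    for (batch,) in dataloader:  # visit every batch, i.e. the whole dataset
        energies.append(batch_energy(batch).item())
    # epoch energy = average of the batch energies
    epoch_energies.append(sum(energies) / len(energies))
```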

9q9q commented 7 months ago

Thanks Alex! Just to confirm I understand what the change is: previously, each epoch would just use the next batch from the dataset, and losses were computed per batch, so during training the model only saw each example once. The change is that each epoch uses the entire dataset, and losses are computed over the whole dataset at each epoch.

It makes sense that the old method would finish very fast, because it only processes `len(dataset)` samples rather than `n_epoch*len(dataset)` samples. I'm guessing the loss and filters learned were also not as good?

belsten commented 6 months ago

Almost. The previous, incorrect method saw `batch_size*n_epoch` samples, while the new method sees `n_epoch*len(dataset)` samples (like you said). And yes, the losses returned previously were only over a single batch; now they are over the whole dataset. The previous method would make sense if we had called `n_epochs` `n_batch_updates` instead, but alas we did not.
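For contrast, a schematic of the old behavior, again with placeholder names: pulling only the next batch each "epoch" means only `batch_size*n_epochs` samples are ever seen.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataloader = DataLoader(TensorDataset(torch.randn(1000, 64)), batch_size=32)
data_iter = iter(dataloader)

n_epochs = 5
for epoch in range(n_epochs):
    (batch,) = next(data_iter)   # old behavior: one batch per "epoch"
    loss = (batch ** 2).mean()   # placeholder energy, over a single batch only
# samples seen: batch_size * n_epochs = 160, not n_epochs * len(dataset) = 5000
```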

In my experience, the loss and filters could still come out fine with the old method; you would just have to make `n_epochs` large.