Also, in regards to the P.S. ("I only changed OneCycleLR, but the same changes can be made to LRFinder as well; I haven't pushed the changes because I want to make LRFinder support a generator for validation_data"):
Currently, I randomly sample one batch of validation data from the full validation set each time I test the loss value. This introduces some stochastic noise, since the loss may drop simply because the sampled validation examples happen to be "easier" or "harder" for the model.
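To make the discussion concrete, the sampling is roughly this (a sketch with illustrative names, not the exact repo code):

```python
import numpy as np

# One random slice of the validation set per evaluation; the loss measured
# on it jitters depending on which samples happen to be drawn.
def sample_validation_batch(X_val, y_val, batch_size):
    idx = np.random.randint(0, len(X_val) - batch_size + 1)
    return X_val[idx:idx + batch_size], y_val[idx:idx + batch_size]
```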
In your opinion, would it be better to cache a single validation batch, or maybe a few batches (say 10 fixed batches, so that all classes appear at least a few times) to test the loss? I haven't checked recently what Fast.ai does in this regard, so we could emulate them if need be.
Now that you mentioned it, I don't see a reason to use validation_data at all. From my understanding, the purpose of LRFinder is to find a "good" learning rate such that the training doesn't diverge, and this can be done by observing the change in the training loss with respect to the learning rate.
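In that form the test needs nothing beyond the training loop itself. A minimal sketch of the idea (names and defaults are illustrative; this is not the repo's actual LRFinder):

```python
import numpy as np
import keras

# Increase the learning rate geometrically every batch and record the
# training loss; a "good" lr sits in the steepest descending region of
# the loss-vs-lr curve, before the loss blows up.
class SimpleLRRangeTest(keras.callbacks.Callback):
    def __init__(self, min_lr=1e-6, max_lr=1.0, num_steps=100):
        super(SimpleLRRangeTest, self).__init__()
        self.lrs = np.geomspace(min_lr, max_lr, num_steps)
        self.history = []  # (lr, training loss) pairs

    def on_batch_begin(self, batch, logs=None):
        step = min(len(self.history), len(self.lrs) - 1)
        keras.backend.set_value(self.model.optimizer.lr, self.lrs[step])

    def on_batch_end(self, batch, logs=None):
        lr = float(keras.backend.get_value(self.model.optimizer.lr))
        self.history.append((lr, (logs or {}).get('loss')))
        if len(self.history) >= len(self.lrs):
            self.model.stop_training = True  # one sweep is enough
```

Running fit for one epoch with this callback and plotting history is enough to eyeball the divergence point.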
I believe Fast.ai uses the validation loss because they want to find the point at which the model diverges on unseen data; NNs can generally keep reducing the training loss even at surprisingly high learning rates.
I really should check out Fast.ai's current implementation of the lr_find() method, but I have a somewhat large backlog to deal with as well.
Should I simply merge and make the formatting corrections myself?
Hi,
I have some deadlines to chase, so it would be great if you don't mind doing it.
Also, in regards to the loss, I checked the paper and apparently they are using validation_loss.
Therefore, my suggestion is to cache a few batches to use for validation, and then renew the cache after a certain number of steps. I will try to do this since I need it for my project.
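Something along these lines could work, assuming a validation generator that yields (x, y) batches; the class and parameter names here are hypothetical:

```python
import numpy as np

# Keep a small fixed pool of validation batches so the measured loss is
# less noisy than a single random batch, and refresh the pool every
# `renew_every` evaluations so it doesn't overfit to one subset.
class ValidationBatchCache(object):
    def __init__(self, val_generator, num_batches=10, renew_every=50):
        self.val_generator = val_generator
        self.num_batches = num_batches
        self.renew_every = renew_every
        self.evals = 0
        self._fill()

    def _fill(self):
        self.cache = [next(self.val_generator) for _ in range(self.num_batches)]

    def evaluate(self, model):
        if self.evals and self.evals % self.renew_every == 0:
            self._fill()
        self.evals += 1
        losses = []
        for x, y in self.cache:
            out = model.test_on_batch(x, y)
            # test_on_batch returns [loss, metric, ...] when metrics are set
            losses.append(out[0] if isinstance(out, list) else out)
        return float(np.mean(losses))
```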
I'm working on Colab with fit_generator instead of fit, and the code failed for me (the keys weren't in the dictionary). I could make it work by re-supplying some of the params (e.g. batch_size).
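For anyone hitting the same error: with fit_generator, Keras doesn't put batch_size (or samples) into the params dict it hands to callbacks, so a direct lookup raises a KeyError. A guarded lookup with an explicit fallback avoids it (a sketch; the callback and argument names are illustrative):

```python
import keras

class SomeLRCallback(keras.callbacks.Callback):
    def __init__(self, batch_size=None, samples=None):
        super(SomeLRCallback, self).__init__()
        self._batch_size = batch_size
        self._samples = samples

    def on_train_begin(self, logs=None):
        # self.params is populated by .fit(); fit_generator omits these
        # keys, so fall back to the user-supplied values instead of
        # indexing the dict directly.
        self.batch_size = self.params.get('batch_size', self._batch_size)
        self.samples = self.params.get('samples', self._samples)
        if self.batch_size is None or self.samples is None:
            raise ValueError('pass batch_size and samples explicitly '
                             'when training with fit_generator')
```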
Some arguments do not need to be manually specified since they can be retrieved from the .fit params. Feel free to merge it if you think this could be beneficial.

For the python formatter, I used pylint, which is why it seems like there are a lot of changes, but most of them are just format changes.

P.S. I only changed OneCycleLR, but the same changes can be made to LRFinder as well. I haven't pushed the changes because I want to make LRFinder support a generator for validation_data.