keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0
61.34k stars, 19.39k forks

Add `on_epoch_begin` to `utils.Sequence` #19749

Open Cerno-b opened 1 month ago

Cerno-b commented 1 month ago

Could we have an on_epoch_begin() function in utils.Sequence?

There is already an on_epoch_end(), but I can see a use case for the begin equivalent:

Let's say I have a data generator that has a fixed internal set of data and maybe does some augmentation. If I want to ensure that all patches have been used in an epoch (say, for debugging reasons), I would add a count variable to the generator, set it to 0 in on_epoch_begin(), increment it in __getitem__() and check it against my internal data size in on_epoch_end().

I could try to achieve this today by resetting count to 0 in on_epoch_end(), but that doesn't work, because Keras may call __getitem__() before the first epoch actually starts (e.g. for internal preprocessing).

So the workaround I currently use is to create a callback object whose on_epoch_begin() calls a reset() method I added to my generator object. It's ugly, but it works.
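For reference, a minimal sketch of that workaround. The class names `CountingGenerator` and `ResetCounter` and the `last_epoch_ok` attribute are made up for illustration. To keep the snippet runnable without Keras installed, the classes are plain Python stand-ins; in real use the first would subclass `keras.utils.Sequence` and the second `keras.callbacks.Callback`, whose `on_epoch_begin(epoch, logs=None)` hook has the same signature used here.

```python
class CountingGenerator:
    """Stand-in for a keras.utils.Sequence subclass (hypothetical name)."""

    def __init__(self, data, batch_size):
        self.data = data
        self.batch_size = batch_size
        self.count = 0             # samples served since the last reset
        self.last_epoch_ok = None  # result of the most recent end-of-epoch check

    def __len__(self):
        # Number of batches per epoch, rounding up for a partial last batch.
        return (len(self.data) + self.batch_size - 1) // self.batch_size

    def __getitem__(self, index):
        start = index * self.batch_size
        batch = self.data[start:start + self.batch_size]
        self.count += len(batch)
        return batch

    def reset(self):
        # Called from the callback, since there is no on_epoch_begin() hook here.
        self.count = 0

    def on_epoch_end(self):
        # Check that every sample was served exactly once in this epoch.
        self.last_epoch_ok = self.count == len(self.data)


class ResetCounter:
    """Stand-in for a keras.callbacks.Callback subclass (hypothetical name)."""

    def __init__(self, generator):
        self.generator = generator

    def on_epoch_begin(self, epoch, logs=None):
        self.generator.reset()


# Simulate one epoch by hand, including a fetch that happens before the epoch starts.
gen = CountingGenerator(list(range(10)), batch_size=4)
callback = ResetCounter(gen)

_ = gen[0]                  # early __getitem__() call, before epoch 0
callback.on_epoch_begin(0)  # the reset discards that early count
for i in range(len(gen)):
    _ = gen[i]
gen.on_epoch_end()
```

With Keras, the callback would be passed as `model.fit(gen, callbacks=[ResetCounter(gen)])`. I haven't checked how this behaves when batches are prefetched by multiple workers, so treat the count as a debugging aid only.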

It would be much nicer to have a proper on_epoch_begin().

fchollet commented 1 month ago

It is certainly feasible. It would need to get added not just in PyDataset, but also in DataAdapter and then called in EpochIterator as well as all EpochIterator subclasses that are backend-specific. Are you able to open a PR for this feature?

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.