hansbuehler / deephedging

Implementation of the vanilla Deep Hedging engine
GNU General Public License v3.0

Tensorflow 2.11 update breaks use of Keras optimizers, quick fix included. #2

Open hjalmarheld opened 1 year ago

hjalmarheld commented 1 year ago

The current code is incompatible with TensorFlow 2.11 due to updates to the optimizer APIs.

The Adam and RMSprop optimizers no longer have the get_weights() method.

See some info here. A quick fix is to simply use the optimizers from the legacy namespace.
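
A minimal sketch of the breakage and the workaround, assuming TF 2.11 is installed (the comments are a guess at the exact behavior):

```python
import tensorflow as tf  # assumes tensorflow >= 2.11

new_opt = tf.keras.optimizers.RMSprop()
# new_opt.get_weights()  # AttributeError under 2.11: the new optimizer class dropped it

legacy_opt = tf.keras.optimizers.legacy.RMSprop()
print(legacy_opt.get_weights())  # still available; empty until the optimizer has been used
```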

This implies changing one line in trainer.py and one line in _trainerserialize.ipynb.

In trainer.py change:

```python
optimzier = config.train("optimizer", "RMSprop", help="Optimizer" )
```

to:

```python
optimzier = config.train("optimizer", tf.keras.optimizers.legacy.RMSprop(), help="Optimizer" )
```

And in _trainerserialize.ipynb equivalently set the optimizer as:

```python
...
# trainer
from tensorflow.keras.optimizers.legacy import Adam
config.trainer.train.optimizer = Adam()
...
```

Apparently the old optimizers are to be kept around indefinitely, so this should be a relatively stable solution.
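
If support for older TensorFlow versions also matters, a version guard is one option. A minimal sketch (not from the repo; the threshold check is an assumption):

```python
import tensorflow as tf

# Use the legacy class only where the new optimizer API applies (TF >= 2.11).
major, minor = (int(x) for x in tf.__version__.split(".")[:2])
if (major, minor) >= (2, 11):
    optimizer = tf.keras.optimizers.legacy.RMSprop()
else:
    optimizer = tf.keras.optimizers.RMSprop()
```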

hansbuehler commented 1 year ago

Much appreciated. I will incorporate this in the next update. Is there a 2.11-compatible way of caching an optimizer together with its internal state?
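
One 2.11-compatible route is tf.train.Checkpoint, which tracks optimizer slot variables alongside model weights. A minimal sketch, assuming a generic `model`/`optimizer` pair (names illustrative, not from trainer.py):

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.Adam()

ckpt = tf.train.Checkpoint(model=model, optimizer=optimizer)
manager = tf.train.CheckpointManager(ckpt, directory="./cache", max_to_keep=1)

# ... call after each epoch (or each caching point):
manager.save()

# On restart, this restores the weights *and* the optimizer slots (momenta etc.):
ckpt.restore(manager.latest_checkpoint)
```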

NB the main reason I've added this is that I am struggling with AWS SageMaker's propensity to die on me with 504 Gateway timeouts. The machine dies; I have to stop, start, and re-run the whole lot again. Caching at least allows me to continue training from the last caching point. Do you happen to have an idea how to address the issue of AWS dying?

hjalmarheld commented 1 year ago

I haven't dug into the details, to be frank, but I found that SciKeras had a similar problem. See the discussion here. They seem to have since implemented another approach, with new util functions found here.
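
In the spirit of those utilities, a rough get/set-state sketch against the TF 2.11 optimizer API (a guess at the pattern, not SciKeras's actual code; `optimizer.variables` and `optimizer.build` are attributes of the new API):

```python
def get_optimizer_state(optimizer):
    # Snapshot every optimizer variable (iteration count, momenta, etc.).
    return [v.numpy() for v in optimizer.variables]

def set_optimizer_state(optimizer, model, state):
    # build() creates the slot variables so there is something to assign to.
    optimizer.build(model.trainable_variables)
    for var, value in zip(optimizer.variables, state):
        var.assign(value)
```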

As for SageMaker, are you talking about notebook instances? It sounds like it could be a RAM issue. Have you tried running it on a larger instance?

hansbuehler commented 1 year ago

Actually, I found it. It appears plotting ate up memory. I've also coded up TF 2.11 serialisation, but it's not tested yet. Will let you know. The new branch also has recurrent agents, an initial delta state, and no longer unrolls the main training loop. I've also got TensorBoard support, even though I haven't managed to get the profiler going.
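
For reference, the generic matplotlib pattern for that kind of leak (a hedged guess at the fix, not the repo's actual plotting code): figures kept alive by pyplot accumulate until they are closed explicitly.

```python
import matplotlib.pyplot as plt

losses = [1.0, 0.6, 0.4, 0.3]  # stand-in for a training history
fig, ax = plt.subplots()
ax.plot(losses)
fig.savefig("progress.png")
plt.close(fig)  # release the figure; without this, long training runs accumulate memory
```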
