keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

Meta-RL #5343

Closed: jpeg729 closed this issue 7 years ago

jpeg729 commented 7 years ago

I would like to experiment with meta-RL, but the central idea involves adding the previous prediction and its quality to the input vector at each timestep.

This means I'll be writing my own training loop, and in the middle of it I will have something like this:

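loss = model.train_on_batch(X[t], Y[t])      # scalar loss for the whole batch
predicted = model.predict_on_batch(X[t])     # second forward pass, just to get the prediction
X[t+1]["previous"] = predicted               # feed the prediction into the next input
X[t+1]["loss"] = loss                        # feed its quality into the next input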

But there are two problems with that.

  1. It runs the prediction step twice: once internally for the training step, and once to give me the prediction.
  2. It gives me a scalar loss, not a vector holding the individual loss for each prediction.

So basically, I would like to have a single function that does everything in one go.

vector_loss, prediction = model.train_and_predict_on_batch(X,Y)
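
To make the request concrete, here is a minimal sketch of such a fused step. It is not Keras code: `TinyLinearModel`, `run_episode`, and the toy data are hypothetical names invented for illustration, the model is a plain linear regressor with a squared-error loss, and everything is written in dependency-free Python. It only shows the shape of the idea: one forward pass yields both the per-sample loss vector and the predictions, and both are fed into the next timestep's input.

```python
class TinyLinearModel:
    """Hypothetical stand-in for a Keras model: prediction = w . x + b."""

    def __init__(self, n_features, learning_rate=0.01):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.learning_rate = learning_rate

    def train_and_predict_on_batch(self, X, Y):
        """Run one forward pass, update the weights, and return
        (per-sample loss vector, predictions made before the update)."""
        # Forward pass, computed once and reused below.
        predictions = [
            sum(w_j * x_j for w_j, x_j in zip(self.w, x)) + self.b for x in X
        ]
        errors = [p - y for p, y in zip(predictions, Y)]
        loss_vector = [e * e for e in errors]  # squared error per sample

        # Gradients of the mean squared error over the batch.
        n = len(X)
        grad_w = [
            2.0 * sum(e * x[j] for e, x in zip(errors, X)) / n
            for j in range(len(self.w))
        ]
        grad_b = 2.0 * sum(errors) / n

        # Plain gradient-descent update.
        self.w = [w_j - self.learning_rate * g for w_j, g in zip(self.w, grad_w)]
        self.b -= self.learning_rate * grad_b
        return loss_vector, predictions


def run_episode(model, inputs, targets):
    """Feed each sample's previous prediction and loss into its next input."""
    batch_size = len(inputs[0])
    prev_pred = [0.0] * batch_size  # nothing has been predicted before step 0
    prev_loss = [0.0] * batch_size
    mean_losses = []
    for x_t, y_t in zip(inputs, targets):
        augmented = [x + [p, q] for x, p, q in zip(x_t, prev_pred, prev_loss)]
        prev_loss, prev_pred = model.train_and_predict_on_batch(augmented, y_t)
        mean_losses.append(sum(prev_loss) / batch_size)
    return mean_losses


# Toy data: 3 timesteps, a batch of 2 samples, 1 base feature per sample.
inputs = [[[1.0], [2.0]] for _ in range(3)]
targets = [[2.0, 4.0] for _ in range(3)]
model = TinyLinearModel(n_features=1 + 2)  # base feature + previous + loss
mean_losses = run_episode(model, inputs, targets)
```

Note that the predictions returned here are the ones made before the weight update, which is what the training step computes internally anyway; the two-call version above returns predictions made after the update.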

unrealwill commented 7 years ago

Hello,

You will probably have to write a custom update function to achieve what you want. You can have a look at how someone did something similar to improve GAN training:

https://github.com/fchollet/keras/issues/5312

jpeg729 commented 7 years ago

If functions like the following existed, I'd be able to experiment in myriad ways.

prediction = model.run_predict()
losses = model.get_loss_vector()
gradients = model.get_gradients()
etc.

I suppose that using functions like these would probably defeat most of the optimisations needed to run efficiently on a GPU, but they would be great for running on a CPU.

Maybe I should just learn TensorFlow.

cassianokc commented 7 years ago

Hi @jpeg729, did you check out https://github.com/matthiasplappert/keras-rl ?

jpeg729 commented 7 years ago

Yes, but only briefly.

The readme lists A3C as still experimental, but meta-RL requires a modified version of A3C that adds the previous action and its value to the input at each timestep.
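
As a rough illustration of that input change, the sketch below builds the augmented vector for one timestep. It assumes that "its value" means the scalar reward received for the previous action and that actions are discrete; both are assumptions, and `one_hot` and `augment_observation` are hypothetical helpers, not part of keras-rl or any A3C implementation.

```python
def one_hot(index, size):
    """Return a one-hot list; all zeros when index is None (no previous action)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def augment_observation(observation, prev_action, prev_reward, n_actions):
    """Concatenate the observation, the one-hot previous action, and the
    previous reward (assumed here to be the 'value' of that action)."""
    return list(observation) + one_hot(prev_action, n_actions) + [float(prev_reward)]


# First timestep of an episode: there is no previous action or reward yet.
first_input = augment_observation([0.5, -1.0], None, 0.0, n_actions=3)

# Later timestep: action 2 was taken previously and earned a reward of 1.0.
later_input = augment_observation([0.5, -1.0], 2, 1.0, n_actions=3)
```

The network's input size then grows from the observation size to the observation size plus `n_actions + 1`.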

I'll go with https://github.com/awjuliani/Meta-RL, an existing TensorFlow implementation of meta-RL.