hill-a / stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/
MIT License

[question] exporting model to tensorflowjs #474

Closed pedrohbtp closed 4 years ago

pedrohbtp commented 4 years ago

First of all, great work with the amazing library!

I am trying to convert the underlying TensorFlow model to TensorFlow.js so that I can use the model in the browser, but I could not make the conversion work.

I followed this GitHub issue to create the necessary TensorFlow files using the code:

def generate_checkpoint_from_model(model, checkpoint_name):
    tf.saved_model.simple_save(model.sess, checkpoint_name, inputs={"obs": model.act_model.obs_ph}, outputs={"action": model.action_ph})

Then I try to transform the model using tensorflowjs_converter

tensorflowjs_converter --input_format=tf_saved_model test/ web_test

However, it gives me the following error:

Unable to lift tensor <tf.Tensor 'loss/action_ph:0' shape=(?,) dtype=int32> because it depends transitively on placeholder <tf.Operation 'loss/action_ph' type=Placeholder> via at least one path, e.g.: loss/action_ph (Placeholder)

I created the following colab notebook with the error so you can try it.

Does anyone know how to make this conversion work?

Thank you for the help

Miffyli commented 4 years ago

You are trying to save the action placeholder used in PPO training (part of the PPO agent), but for inference you only need the trained policy and its placeholders (model.act_model). The code on colab runs without errors after changing the call to simple_save to this:

tf.saved_model.simple_save(model.sess, checkpoint_name, inputs={"obs": model.act_model.obs_ph},
                                   outputs={"action": model.act_model._policy_proba})

The value of _policy_proba depends on the environment/algorithm.
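
Putting the fix together, a minimal sketch of the export helper might look like the following. It assumes TensorFlow 1.x and a stable-baselines model with a feed-forward policy (e.g. MlpPolicy). The names `signature_for` and `export_saved_model` are hypothetical, and TensorFlow is imported lazily so the signature helper can be inspected without it.

```python
def signature_for(act_model):
    """Return (inputs, outputs) dicts that reference only the inference policy."""
    inputs = {"obs": act_model.obs_ph}
    # _policy_proba is a private attribute; what it holds depends on the
    # environment's action space and the algorithm.
    outputs = {"action": act_model._policy_proba}
    return inputs, outputs


def export_saved_model(model, export_dir):
    """Write a SavedModel for a trained agent (assumes TensorFlow 1.x)."""
    import tensorflow as tf  # imported here so signature_for has no TF dependency

    inputs, outputs = signature_for(model.act_model)
    tf.saved_model.simple_save(model.sess, export_dir, inputs=inputs, outputs=outputs)
```

The exported directory can then be passed to tensorflowjs_converter exactly as in the original command.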

Thanks for offering the colab code, btw! It really made debugging this very easy :)

pedrohbtp commented 4 years ago

Thanks a lot for taking the time to look into it, @Miffyli ! I am glad the colab helped.

araffin commented 4 years ago

Now that we have examples of how to export a trained agent to pytorch/java/tfjs, it could be good to add links to the issues in the documentation, no?

Miffyli commented 4 years ago

@araffin I had a similar thought. I can write up a summary of the results of the issues and some instructions on manually creating the network and then loading the parameters. Exporting that way is not perfect, as it does not store the graph / how layers are connected, but it is a start.

pedrohbtp commented 4 years ago

Adding it to the documentation would be great! I bet it would help a lot of other people in the same situation.

sbhadade commented 4 years ago

@pedrohbtp @araffin @Miffyli Thanks for your work on GitHub. How can I use a stable-baselines agent model and environment in production, using Keras or TF, for live time series data? Is there a working example? Thanks
Swapnil

Miffyli commented 4 years ago

@sbhadade

There are no examples of this, other than the documentation on exporting models and the example of how to use a trained model without training.
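
As a rough illustration of using a trained model without training, the sketch below feeds a stream of observations to an already loaded agent. It assumes the model was loaded beforehand (for example with `PPO2.load(...)`) and relies only on the `predict` method of stable-baselines models, which returns an action and the next recurrent state. The name `act_on_stream` and the idea of observations arriving as an iterable are assumptions made for illustration.

```python
def act_on_stream(model, observations):
    """Yield one action per incoming observation; no training takes place."""
    state = None  # recurrent policies carry state between calls; others return None
    for obs in observations:
        # deterministic=True picks the most likely action instead of sampling
        action, state = model.predict(obs, state=state, deterministic=True)
        yield action
```

Each observation must have the same shape and preprocessing as the observations the agent saw during training.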

pedrohbtp commented 4 years ago

@Miffyli I deployed this snake AI running on tensorflowjs after exporting it following the guidelines here. In the browser it is used only for inference and not for training as discussed previously. If someone wants a reference of it, here it is: https://www.pedro-torres.com/snake-rl/

Miffyli commented 4 years ago

@pedrohbtp

Thanks for sharing this! It is always refreshing to hear things worked out and there weren't bigger bottlenecks after the export :). Indeed looks like the agent is still intact in the javascript model.

araffin commented 4 years ago

Looks cool, when you think your project is ready you can also submit a PR to add it to the project page ;)

pedrohbtp commented 4 years ago

@araffin Sounds good! What do you have in mind for including it? Would you want to add it to the exporting section of the documentation?

araffin commented 4 years ago

@pedrohbtp I was thinking about the project page: https://stable-baselines.readthedocs.io/en/master/misc/projects.html because the tensorflow js export is already present in the exporting section.

mleonrivas commented 4 years ago

@pedrohbtp I'm trying to do the same thing you were doing here, with a small difference: I have the pkl file generated after running the training command from the terminal (baselines.run .... --save_path=./myModel.pkl)

Is there any way to transform the pkl file to something tensorflowjs_converter would understand?

araffin commented 4 years ago

It seems you are using baselines and not stable-baselines...

roark commented 4 years ago

@Miffyli simple_save code worked perfectly for me when converting PPO2 and MlpPolicy.

However, when I run tensorflowjs_converter on a PPO2 model with MlpLstmPolicy, I receive the following error:

 File "/Users/zeus/.local/share/virtualenvs/wt-agent-trainer-818OHLeT/lib/python3.7/site-packages/tensorflow_core/python/ops/op_selector.py", line 413, in map_subgraph
    % (repr(init_tensor), repr(op), _path_from(op, init_tensor, sources)))
tensorflow.python.ops.op_selector.UnliftableError: A SavedModel signature needs an input for each placeholder the signature's outputs use. An output for signature 'serving_default' depends on a placeholder which is not an input (i.e. the placeholder is not fed a value).

Unable to lift tensor <tf.Tensor 'output/Softmax:0' shape=(1, 2) dtype=float32> because it depends transitively on placeholder <tf.Operation 'input_1/dones_ph' type=Placeholder> via at least one path, e.g.: output/Softmax (Softmax) <- model/pi/add (AddV2) <- model/pi/MatMul (MatMul) <- model/Reshape_2 (Reshape) <- model/concat_1/concat (Identity) <- model/mul_4 (Mul) <- model/Tanh_3 (Tanh) <- model/add_2 (AddV2) <- model/mul_3 (Mul) <- model/Tanh_2 (Tanh) <- model/split_3 (Split) <- model/add_1 (AddV2) <- model/add (AddV2) <- model/MatMul_1 (MatMul) <- model/mul_1 (Mul) <- model/sub_1 (Sub) <- model/Squeeze_1 (Squeeze) <- model/split_1 (Split) <- model/Reshape_1 (Reshape) <- input_1/dones_ph (Placeholder)

Any ideas on how I can best resolve or debug the error better?

Thanks for the great library!

Miffyli commented 4 years ago

I am not familiar with saving models like this, but sounds like you are missing some of the placeholders/inputs to the graph. LSTM policies take in more inputs than non-lstm policies (mask and state), which you need to account for.
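
One way to debug this is to list every placeholder an output tensor depends on, and then make sure each of them appears among the signature inputs. The sketch below walks the graph through the `op`, `inputs`, `name` and `type` attributes of TensorFlow 1.x tensors and operations. The second helper assumes that the recurrent policy exposes `states_ph`, `dones_ph` and `snew`, as stable-baselines LSTM policies appear to. Both function names are hypothetical, and the result has not been verified against tensorflowjs_converter.

```python
def placeholders_feeding(tensor):
    """Collect the names of all Placeholder ops the tensor depends on."""
    found, seen, stack = set(), set(), [tensor.op]
    while stack:
        op = stack.pop()
        if op.name in seen:
            continue  # already visited through another path
        seen.add(op.name)
        if op.type == "Placeholder":
            found.add(op.name)
        # follow every input tensor back to the op that produces it
        stack.extend(t.op for t in op.inputs)
    return found


def recurrent_signature(act_model):
    """Build inputs/outputs for an LSTM policy (attribute names are assumptions)."""
    inputs = {
        "obs": act_model.obs_ph,
        "state": act_model.states_ph,  # LSTM state from the previous step
        "mask": act_model.dones_ph,    # episode-done mask that resets the state
    }
    outputs = {
        "action": act_model._policy_proba,
        "state_out": act_model.snew,   # new LSTM state to feed back in next step
    }
    return inputs, outputs
```

For example, calling `placeholders_feeding(model.act_model._policy_proba)` on the model from the traceback above should report `input_1/dones_ph` among its results, which is the input missing from the original signature. This assumes `_policy_proba` is a single tensor, as it is for the softmax output shown in the error.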