Closed: Gregwar closed this 1 year ago
Hello, that would be a valuable extension to SB3 but should be done in the RL Zoo I think (or in an external repo).
> Here is how I would see it:
Are those different options or a list of features?
but yes, in the end it would be nice to have something like `python sb3_export.py --algo a2c --env MyEnv -i path_to_model`
or in the case of the RL Zoo: `python sb3_export.py --algo a2c --env MyEnv -f exp_folder --exp-id 1`
(it finds the experiment folder automatically and also loads the normalization and other wrappers if needed)
> I would like some questions to be answered about how you imagine the feature.
what are your questions?
My list is a list of features.
About questions, I mostly wanted to know what was the recommended direction for this. Making it some contrib to the RL Zoo indeed looks like the way to go, since the export can be achieved "from the outside" of SB3.

> Making it some contrib to the RL Zoo indeed looks like the way to go, since the export can be achieved "from the outside" of SB3.
Feel free to open a draft PR there if you want to discuss it in more details ;)
I started working on that in the RL Zoo and I will indeed open a draft PR soon; even if it does not support everything, it can be used as a base for discussion.
I have a very short-term goal of embedding inference in our humanoid robots, so I will also be the first user.
Ok, I started a draft PR: https://github.com/DLR-RM/rl-baselines3-zoo/pull/228/
The export is initiated through `enjoy.py`, since I didn't want to duplicate or factor out the environment loading logic. To start the export, pass `--export-cpp target_directory` to `enjoy.py` (supplementing all the usual flags to load your model).

There is a `cpp/` directory in the RL Zoo that is a "skeleton" of a C++ project; when the export starts, the target directory is created from this "template". If you run `--export-cpp` for multiple environments, it will add/update the classes in the target project.

To test that it indeed works, I added an option to generate a Python binding while building, so that we can directly use Python's gym to test it. Here are the steps:
- Be sure `pybind11` is available (e.g. on your `CMAKE_PREFIX_PATH`), with for instance `apt-get install python3-dev python3-pybind11`
- Train `DQN` with `CartPole-v1` and then run something like: `python enjoy.py --env CartPole-v1 --algo dqn -f logs/ --export-cpp predictors/`
- Go in `predictors` and build:

```shell
cd predictors
mkdir build/
cd build
cmake -DBASELINES3_PYBIND=ON ..
make -j8
```

- This should produce a `libbaselines3_models.so`, that is where your predictors are, and a `baselines3_py.cpython-36m-x86_64-linux-gnu.so`, which allows you to test that it works using a Python env.

From here you can test with such a script:
```python
import gym

from baselines3_py import CartPole_v1

cp = CartPole_v1()
env = gym.make('CartPole-v1')
obs = env.reset()

while True:
    action = cp.predict(obs)
    obs, reward, done, info = env.step(int(action[0]))
    env.render("human")
    if done:
        obs = env.reset()
```
It should show you the CartPole, using the C++-built library for prediction. You can also of course build without the Python binding and use the library from your C++ code. An example can be found in `predict.cpp`, which is built as a binary if you set `BASELINES3_BIN` to `ON` (hard-coded for `CartPole-v1`).
Maybe you are just looking for something like this... https://onnxruntime.ai/docs/
@stalek71 thanks for the lead, however the problem here is not exactly to save a PyTorch model to a C++ executable file and load it (which is explained in [1]), but to export SB3 models (at least, in the first place, `model.predict()`) to C++.
More specifically, it implies:
This is not a very complicated matter, but it requires some specific knowledge of how the underlying RL agents work. So I guess it's good to have the whole process automated.
(The use case is: I want this running on a robot without executing any Python code, because it is an embedded real-time robotics application.)
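For reference, the manual route boils down to TorchScript: trace the policy's forward pass and save it, so the C++ side can load it with `torch::jit::load`. A minimal sketch, assuming a plain feed-forward network as a stand-in for a real SB3 policy (`policy.pt` is a hypothetical file name):

```python
import torch
import torch.nn as nn

# Stand-in for an SB3 policy network (hypothetical: 4-dim obs, 2 actions).
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
policy.eval()

# Trace the forward pass with an example observation and save as TorchScript.
example_obs = torch.zeros(1, 4)
traced = torch.jit.trace(policy, example_obs)
traced.save("policy.pt")  # loadable from C++ via torch::jit::load("policy.pt")

# Sanity check: the reloaded module matches the original network.
reloaded = torch.jit.load("policy.pt")
obs = torch.randn(1, 4)
assert torch.allclose(policy(obs), reloaded(obs))
```

The part this issue wants to automate is everything around that trace: picking the right sub-module of the agent and reproducing the pre/post-processing that `predict()` does.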
Thanks for the PR =)
> The export is initiated through `enjoy.py`, since I didn't want to duplicate or factor out the environment loading logic. To start the export, pass `--export-cpp target_directory` to `enjoy.py` (supplementing all the usual flags to load your model)
sounds good.
> So far only action inference is provided
I think we should keep the first version as simple as possible (for instance limiting ourselves to a subset of models or action spaces)
> …, they are embedded in the binary as resources using CMRC

how much more difficult is it to just give a path to `torch.jit.load()`?
> Go in predictor and build:
this could be even automated, no?
> `from baselines3_py import CartPole_v1`
I would rather keep the name of the algorithm (or concatenate it with the name of the env) to avoid confusion.
> This should produce a `libbaselines3_models.so`, that is where your predictors are
Do you also have an example cpp file to show how to use that shared lib?
for the name, we will discuss it too (whether it should be `baselines3` or `sb3` or `stablebaselines3`, I would lean towards the last two ;))
> `action = cp.predict(obs)`
This is not consistent with the SB3 API, but I think it's fine as it is targeted towards deployment (and only RecurrentPPO requires a hidden state).
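To make the API difference concrete, here is a toy numpy sketch (hedged: not the PR's actual interface, just an illustration): SB3's `predict` returns `(action, state)` so recurrent policies can thread a hidden state through, while a deployment predictor for a feed-forward policy can return the action alone:

```python
import numpy as np

W = np.array([[1.0, -1.0], [0.5, 2.0]])  # toy linear Q-function weights

def sb3_style_predict(obs, state=None):
    # SB3 API shape: always return (action, next_state), even when unused.
    q_values = obs @ W
    return np.argmax(q_values, axis=-1), state

def deployment_predict(obs):
    # Deployment API shape: feed-forward policy, no hidden state to carry.
    q_values = obs @ W
    return np.argmax(q_values, axis=-1)

obs = np.array([[1.0, 0.0]])
action, _ = sb3_style_predict(obs)
assert (action == deployment_predict(obs)).all()
```

A recurrent policy (RecurrentPPO) would be the one case where dropping the `state` argument loses information.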
> how much more difficult is it to just give a path to `torch.jit.load()`?
It is not, I really can make it this way, dropping the dependency on CMRC or making it optional.
> this could be even automated, no?
Enabling the Python binding is more like a test than a real use case. There is likely not that much performance boost because most of the computation is actually achieved by PyTorch, it's just about embedding it in C++.
But yes, we could automate the build and run of Python tests on top of the library for unit-test purposes, CI and so on.
> Do you also have an example cpp file to show how to use that shared lib?
So far just the simple: https://github.com/Gregwar/rl-baselines3-zoo/blob/export_cpp/cpp/src/predict.cpp
> …, dropping the dependency on CMRC or making it optional
Fewer dependencies are usually better ;)
> Enabling the Python binding is more like a test than a real use case.
I meant automating the build of the shared lib, but I probably misunderstood what you wrote.
> we could automate the build and run of Python tests on top of the library for unit-test purposes
it would be nice to have at least some tests in the CI (nothing too complicated)
> So far just the simple:
thanks =)
Closing that one in favor of another one that's going to be opened soon (per discussion with Grégoire).
Question
Hello, I am using SB3 to train some model where I want the inference to run on embedded robots using C++. I had a look at PyTorch documentation and doing this is not very hard "by hand", but the process could indeed be automated.
I could maybe contribute, but I would like some questions to be answered about how you imagine the feature.
In my opinion, there is more that can be done than simply documenting it, because we could help to automate the observation and action rescaling and normalizing, in a way such that methods like `model.predict` get converted seamlessly to C++.

Here is how I would see it:

- Export the models (`.pt`) from the different agents implemented in SB3

What do you think?
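The rescaling/normalizing mentioned above is the part that is easy to forget when deploying. A sketch of the two transforms that an exporter would need to reproduce on the C++ side (assuming SB3-like conventions: `VecNormalize` standardizes and clips observations, and continuous actions are mapped from [-1, 1] back to the action-space bounds):

```python
import numpy as np

def normalize_obs(obs, mean, var, clip_obs=10.0, epsilon=1e-8):
    # VecNormalize-style: standardize with running stats, then clip.
    return np.clip((obs - mean) / np.sqrt(var + epsilon), -clip_obs, clip_obs)

def unscale_action(scaled, low, high):
    # Map a squashed action in [-1, 1] back to the Box bounds [low, high].
    return low + 0.5 * (scaled + 1.0) * (high - low)

obs = np.array([2.0, -3.0])
norm = normalize_obs(obs, mean=np.zeros(2), var=np.ones(2))    # ~[2., -3.]
act = unscale_action(np.array([0.0]), np.array([-2.0]), np.array([4.0]))
# act is [1.], the midpoint of [-2, 4]
```

An exporter would have to emit the running `mean`/`var` (and the action bounds) alongside the `.pt` file so the C++ predictor can apply the same transforms.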