-
Hi,
We developed and tested our algorithm OT-TRPO (to appear at NeurIPS 2022; you can find the preprint [here](https://arxiv.org/abs/2210.11137)) using Stable Baselines.
Is there an …
-
...
```sh
python3 logging_bio_args.py --total_timesteps=10000000.0 --SEED=4 --timesteps_per_batch=4000 --algo=TRPO
python3 logging_bio_args.py --total_timesteps=10000000.0 --SEED=4 --timesteps_per_batch=8…
```
-
Hi,
Is there any simple way in rllab to save the learned policy?
For example, using TRPO, I want to save the policy once it has been learned, so that I can simply look into the trajectory/path of …
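In case it helps: rllab components are built on its Serializable base, so the trained policy object can usually just be pickled and reloaded. A minimal sketch, assuming a Box2D cart-pole setup (file names and iteration counts here are arbitrary):

```python
import pickle

from rllab.algos.trpo import TRPO
from rllab.baselines.linear_feature_baseline import LinearFeatureBaseline
from rllab.envs.box2d.cartpole_env import CartpoleEnv
from rllab.envs.normalized_env import normalize
from rllab.policies.gaussian_mlp_policy import GaussianMLPPolicy

env = normalize(CartpoleEnv())
policy = GaussianMLPPolicy(env_spec=env.spec)
algo = TRPO(env=env, policy=policy,
            baseline=LinearFeatureBaseline(env_spec=env.spec),
            n_itr=40)
algo.train()

# Persist the trained policy object.
with open("policy.pkl", "wb") as f:
    pickle.dump(policy, f)

# Later: reload it and roll out one trajectory to inspect the path.
with open("policy.pkl", "rb") as f:
    policy = pickle.load(f)
obs = env.reset()
path = [obs]
for _ in range(100):
    action, _ = policy.get_action(obs)
    obs, reward, done, _ = env.step(action)
    path.append(obs)
    if done:
        break
```

Running through run_experiment_lite with a snapshot_mode set also makes rllab write per-iteration .pkl snapshots on its own, which may already be enough.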
-
While running TRPO training, after some time (random, anywhere from 15 sec to 1 min) it crashes with the following:
```
Traceback (most recent call last):
  File "callback.py", line 196, in <module>
    model.lea…
```
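For context, a minimal self-contained sketch of the call pattern the traceback points at (a script passing a custom callback into model.learn); this assumes the sb3-contrib TRPO implementation, and ProgressCallback here is hypothetical, not the script's actual callback:

```python
from sb3_contrib import TRPO
from stable_baselines3.common.callbacks import BaseCallback

class ProgressCallback(BaseCallback):
    """Hypothetical callback: report progress every 1000 steps."""
    def _on_step(self) -> bool:
        if self.n_calls % 1000 == 0:
            print(f"{self.num_timesteps} timesteps so far")
        return True  # returning False here aborts training

model = TRPO("MlpPolicy", "Pendulum-v1", verbose=0)
model.learn(total_timesteps=10_000, callback=ProgressCallback())
```

If a stripped-down setup like this still dies at a random point, the fault is more likely in the environment or the callback body than in TRPO itself.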
-
Thanks to the OpenAI team for the latest release!
Are there any benchmark results (like Atari scores) for PPO and TRPO? DQN has a report here: https://github.com/openai/baselines-results. It's super…
-
https://github.com/whoisthisadam/trpo-practice/blob/6c0fbddbf4d11896c28e7a22223c800c3ad18a22/trpo%20pz5.cpp#L127
to_string() cannot be used here; you need to write your own conversion function instead.
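For illustration, a sketch of the kind of hand-rolled conversion the comment asks for (the function name is hypothetical):

```cpp
#include <string>

// Manual int-to-string conversion, replacing std::to_string().
std::string int_to_string(int value) {
    if (value == 0) return "0";
    bool negative = value < 0;
    // Widen before negating so INT_MIN does not overflow.
    long long v = value;
    if (negative) v = -v;
    std::string digits;
    while (v > 0) {
        digits.push_back(static_cast<char>('0' + v % 10));
        v /= 10;
    }
    if (negative) digits.push_back('-');
    // Digits were collected least-significant first; reverse them.
    return std::string(digits.rbegin(), digits.rend());
}
```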
-
This will be an especially interesting task, since I believe it was originally made for OpenAI Gym sessions, and I do not think we should try to cobble together a data structure to spoof a Gym.
We…
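For a sense of what spoofing would entail: even a minimal stand-in has to implement the full classic (pre-Gymnasium) gym.Env surface. A hypothetical stub for illustration:

```python
import gym
import numpy as np
from gym import spaces

class StubEnv(gym.Env):
    """Hypothetical stand-in showing the interface a Gym session expects."""

    def __init__(self):
        self.observation_space = spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
        self.action_space = spaces.Discrete(2)
        self._t = 0

    def reset(self):
        self._t = 0
        return self.observation_space.sample()

    def step(self, action):
        self._t += 1
        obs = self.observation_space.sample()
        done = self._t >= 10  # classic API: (obs, reward, done, info)
        return obs, 0.0, done, {}
```

Anything less than this (consistent spaces, the reset/step contract) tends to break Gym-based tooling in opaque ways, which is the argument against cobbling one together.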
-
```
Traceback (most recent call last):
  File "TRPO.py", line 169, in <module>
    action = agent.act_and_train(obs, reward)
  File "C:\Anaconda3\envs\osim-rl2\lib\site-packages\chainerrl\agents\trpo.py", line 680, in …
```