-
Hi, it's really great that facebookresearch is considering providing a library for reinforcement learning research.
It would be very helpful if the library provided the low-level functionality rather …
-
Thanks a lot for maintaining the Rust bindings! I have built a very simple actor-critic algorithm in Rust, and it works like a charm. However, how can I emit events for TensorBoard to get fancy graphs?
…
-
Thanks for your nice work, but this is my first time using Chainer.
-
What are the caveats for serialization, and how should complex objects generally be handled? Annotating the agent directly just turns out like this.
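For context, the usual caveat with complex objects is members that cannot be pickled (file handles, locks, environment references). A minimal sketch of the standard workaround, using a hypothetical `Agent` class (not from any specific library):

```python
import os
import pickle

class Agent:
    """Toy agent holding an unpicklable member (hypothetical example)."""
    def __init__(self):
        self.steps = 0
        # File handles cannot be pickled directly.
        self.log_file = open(os.devnull, "w")

    def __getstate__(self):
        # Drop the unpicklable handle; keep only plain state.
        state = self.__dict__.copy()
        del state["log_file"]
        return state

    def __setstate__(self, state):
        # Restore plain state and recreate the handle on load.
        self.__dict__.update(state)
        self.log_file = open(os.devnull, "w")

agent = Agent()
agent.steps = 42
clone = pickle.loads(pickle.dumps(agent))
print(clone.steps)  # 42
```

The same `__getstate__`/`__setstate__` pattern covers most "complex object" cases; anything that is pure data round-trips through pickle without extra work.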
-
I want to make a Furuta pendulum, like [this](https://www.google.com/imgres?imgurl=https%3A%2F%2Fwww.researchgate.net%2Fpublication%2F227017529%2Ffigure%2Ffig1%2FAS%3A302327165669385%401449091821542%2…
-
Hello, I need to make SacAgent work with discrete actions, so I tried to implement the Gumbel-Softmax reparameterization trick by re-defining the relevant classes. However, the calculation of `agent.train(experie…
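For reference, the relaxation I am after is the standard Gumbel-Softmax: $y_i = \mathrm{softmax}((\log \pi_i + g_i)/\tau)$ with $g_i \sim \mathrm{Gumbel}(0,1)$. A minimal stdlib sketch (the function name and temperature default are my own, not from tf-agents):

```python
import math
import random

def gumbel_softmax(logits, tau=1.0, seed=0):
    """Gumbel-Softmax sample: softmax((logits + g) / tau), g ~ Gumbel(0, 1)."""
    rnd = random.Random(seed)
    # Gumbel(0, 1) noise via inverse transform: g = -log(-log(u)).
    g = [-math.log(-math.log(rnd.uniform(1e-9, 1.0))) for _ in logits]
    z = [(l + gi) / tau for l, gi in zip(logits, g)]
    m = max(z)                              # subtract max for numerical stability
    e = [math.exp(zi - m) for zi in z]
    s = sum(e)
    return [ei / s for ei in e]

probs = gumbel_softmax([1.0, 2.0, 3.0], tau=0.5)
```

Lowering `tau` pushes the output toward a one-hot vector (a hard discrete choice), while larger `tau` keeps it smooth and well-behaved for gradients.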
-
Hi,
I have been training a custom robot based on the a1 example. I repeatedly get the following error a random number of seconds into training:
```
Traceback (most recent call last):
…
-
This issue grew out of the discussion in
https://github.com/thu-ml/tianshou/pull/950#discussion_r1342174137_
## Summary
Currently, all policies take actor/critic/critic2 optimizers t…
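One possible direction (a sketch only; all names here are hypothetical, not the actual tianshou API): have policies accept an optimizer *factory* instead of constructed optimizers, so that the policy, which owns the parameters, is the one that pairs them with an optimizer:

```python
class OptimizerFactory:
    """Builds an optimizer once the policy knows its parameters (hypothetical API)."""
    def __init__(self, optim_class, **kwargs):
        self.optim_class = optim_class
        self.kwargs = kwargs

    def create(self, params):
        return self.optim_class(params, **self.kwargs)

class SGD:
    """Stand-in optimizer so the sketch runs without torch installed."""
    def __init__(self, params, lr=0.01):
        self.params = list(params)
        self.lr = lr

class Policy:
    def __init__(self, actor_params, optim_factory):
        # The policy, not the caller, binds parameters to the optimizer.
        self.optim = optim_factory.create(actor_params)

policy = Policy([1.0, 2.0], OptimizerFactory(SGD, lr=0.001))
```

This removes the caller's burden of constructing actor/critic/critic2 optimizers against parameters it does not own, and lets the policy create as many optimizers as it internally needs.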
-
Hello!
I noticed that the maximum number of episodes can be controlled by MAX_EPISODES during training, and EVAL_INTERVAL determines the evaluation interval; however, the evaluation process seems to determi…
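To make the question concrete, this is the kind of loop I have in mind (the constants mirror the names above; everything else is a sketch, not the library's actual code):

```python
MAX_EPISODES = 10    # total training episodes
EVAL_INTERVAL = 3    # evaluate every N training episodes

def train_episode(ep):
    return f"train {ep}"      # placeholder for one training episode

def evaluate():
    return "eval"             # placeholder for one evaluation pass

log = []
for ep in range(1, MAX_EPISODES + 1):
    log.append(train_episode(ep))
    if ep % EVAL_INTERVAL == 0:
        # My question: what controls how many episodes run inside here?
        log.append(evaluate())

# With the settings above, evaluation triggers at episodes 3, 6, and 9.
```

So the interval controls *when* evaluation happens, but I cannot find what controls *how long* each evaluation runs.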
-
I attempted to run softlearning with mujoco210, but it was unsuccessful. Is there currently an incompatibility between softlearning and mujoco210?