-
Hello
I am amazed by your work. I am wondering if you tested the Sokoban game with standard RL methods (Q-learning, A2C, etc.), and whether you have success rates for this kind of game?
-
Hi, XTuner Team
Could you please add a citation for the source of the Ray+vLLM-based RLHF architecture, OpenRLHF, for example in the README.md file: https://github.com/InternLM/xtuner?tab=readme-ov-fi…
-
### Question
I'm looking for a solution to this error.
```
[INFO]: Base environment:
Environment device : cuda:0
Physics step-size : 0.005
Rendering step-size : 0.02
Environment …
```
-
Hi,
Thanks for your package and the accompanying article.
Unfortunately, I am not able to test your package; I receive the following error after running `python ddpg.py` in the gym_torcs directory:
…
-
First, thank you for the code accompanying the paper "Discrete and Continuous Action Representation for Practical RL in Video Games".
Second, according to your code, all of the action spaces of the environ…
-
Is there a way to save weights to a file and reload them later? For instance, in the car example, ui.cpp lets the user control the car and car.cpp appears to train it. I am assuming may…
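As a general pattern (the helper names below are my own illustration, not functions from this repository), weights can be serialized to a small binary file and read back, e.g. as `[count: uint32][count little-endian float64 values]`:

```python
import os
import struct
import tempfile

def save_weights(path, weights):
    # Binary layout: a uint32 count followed by that many float64 values.
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(weights)))
        f.write(struct.pack(f"<{len(weights)}d", *weights))

def load_weights(path):
    with open(path, "rb") as f:
        (count,) = struct.unpack("<I", f.read(4))
        return list(struct.unpack(f"<{count}d", f.read(8 * count)))

# Round-trip check
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
save_weights(path, [0.1, -2.5, 3.0])
restored = load_weights(path)
```

The same fixed-width layout is easy to mirror in C++ with `fwrite`/`fread` if the car example's training code keeps its weights in a flat array.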
-
```
AGENT NAME: A3C
1.1: A3C
TITLE CartPole
layer info [20, 10, [2, 1]]
layer info [20, 10, [2, 1]]
{'learning_rate': 0.005, 'linear_hidden_units': [20, 10], 'final_layer_activation': ['SOFTMAX', …
```
-
I'm trying to differentiate the MJX step function via the autograd function `jax.grad()` in JAX, like:
```python
def step(vel, pos):
    mjx_data = mjx.make_data(mjx_model)
    mjx_data = mjx_data.replace(q…
```
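For reference, here is a minimal, self-contained sketch of the same pattern with a toy step function standing in for the MJX call (the model/data operations are replaced by plain `jax.numpy` math, since `jax.grad` requires a pure function that returns a scalar):

```python
import jax
import jax.numpy as jnp

# Toy stand-in for an MJX step; not the real mjx API.
def toy_step(vel, pos):
    dt = 0.005                    # arbitrary timestep for the toy dynamics
    new_pos = pos + dt * vel      # one explicit Euler step
    return jnp.sum(new_pos ** 2)  # reduce the new state to a scalar loss

# Gradients of the scalar output with respect to both inputs.
dvel, dpos = jax.grad(toy_step, argnums=(0, 1))(jnp.ones(3), jnp.zeros(3))
```

If the real step works the same way, the key requirements are that every operation inside `step` be traceable by JAX and that the returned value be a scalar (or that a scalar reduction be applied before `jax.grad`).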
-
## Motivation
Plan for the doc revamp
- [x] A 0-to-1 tutorial or Getting started #861
- [x] A tutorial on building a custom env #911
- [ ] A tutorial on model ensembling #876
- [x] A tutorial…
-
Instead of using different default arguments for different algorithms, create config files and load arguments from there.
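One possible shape for this (file names, keys, and values below are illustrative assumptions, not the project's actual defaults) is one JSON config file per algorithm, with explicit user arguments overriding the file's values:

```python
import json
import tempfile
from pathlib import Path

# Illustrative per-algorithm defaults.
DEFAULTS = {
    "a2c": {"learning_rate": 7e-4, "n_steps": 5},
    "ppo": {"learning_rate": 3e-4, "n_steps": 2048},
}

def write_default_configs(config_dir: Path) -> None:
    # One JSON file per algorithm instead of per-algorithm argument defaults.
    config_dir.mkdir(parents=True, exist_ok=True)
    for name, params in DEFAULTS.items():
        (config_dir / f"{name}.json").write_text(json.dumps(params, indent=2))

def load_config(config_dir: Path, algorithm: str, **overrides) -> dict:
    # File defaults are loaded first; explicit user overrides win.
    params = json.loads((config_dir / f"{algorithm}.json").read_text())
    params.update(overrides)
    return params

config_dir = Path(tempfile.mkdtemp())
write_default_configs(config_dir)
cfg = load_config(config_dir, "ppo", learning_rate=1e-4)
```

This keeps the algorithm code free of hard-coded defaults while still letting command-line arguments take precedence.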