-
Hi, impressive work! Your paper evaluated the tasks with ACT, diffusion policy, and some RL methods, but I haven't found those algorithms in this codebase. Do you plan to open-source this part of the code? Since…
-
I have recently been using the OpenSpiel codebase for a research project and need to modify the reward settings in the games. However, I found that the rewards are encapsulated within pyspiel.so, maki…
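Since rewards computed inside pyspiel.so can't be edited without rebuilding the library, one workaround is to shape them on the Python side. A minimal sketch, assuming the standard `open_spiel.python.rl_environment` wrapper; `shape_reward` is a hypothetical user-defined function, not part of OpenSpiel:
```python
# Sketch: shaping rewards outside pyspiel.so by wrapping the Python
# RL environment. `shape_reward` is a hypothetical helper, not an
# OpenSpiel API.
from open_spiel.python import rl_environment


def shape_reward(reward):
    # Hypothetical example: rescale the raw reward and add a small
    # per-step penalty to encourage shorter games.
    return 10.0 * reward - 0.01


class ShapedRewardEnv:
    """Wraps rl_environment.Environment and rewrites its rewards."""

    def __init__(self, game_name):
        self._env = rl_environment.Environment(game_name)

    def reset(self):
        return self._env.reset()

    def step(self, actions):
        time_step = self._env.step(actions)
        # rewards is None on the initial time step, so guard for that.
        if time_step.rewards is not None:
            shaped = [shape_reward(r) for r in time_step.rewards]
            time_step = time_step._replace(rewards=shaped)
        return time_step


env = ShapedRewardEnv("tic_tac_toe")
```
This keeps the compiled game untouched and confines the modified reward logic to your own training loop.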
-
I have tried some RL algorithms, but none of them manages to land successfully.
Do you have a working example, e.g. PPO with specific hyperparameters?
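For what it's worth, a minimal sketch using Stable-Baselines3's PPO with hyperparameters close to the RL Zoo's tuned values for Gymnasium's LunarLander-v2. These are assumptions; whether they transfer to this repository's lander task is untested:
```python
# Sketch: PPO on a LunarLander-style task with Stable-Baselines3.
# Hyperparameters roughly follow the SB3 RL Zoo values tuned for
# LunarLander-v2; treat them as a starting point only.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("LunarLander-v2")  # "LunarLander-v3" on newer Gymnasium

model = PPO(
    "MlpPolicy",
    env,
    n_steps=1024,      # rollout length per update
    batch_size=64,
    n_epochs=4,
    gamma=0.999,       # long horizon: landing reward arrives late
    gae_lambda=0.98,
    ent_coef=0.01,     # extra exploration helps avoid hovering
    verbose=1,
)
model.learn(total_timesteps=1_000_000)
model.save("ppo_lander")
```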
-
### What happened + What you expected to happen
I am trying to run a regression test on the cartpole example and am running into the issue below.
```
(rayvenv) shpa7847@UCB-TDLQ372645 bsk_rl % pyth…
```
-
I can obtain the episode reward mean from the train result, but it fluctuates heavily, which makes it hard to judge when to stop training, so I would like to use the evaluation results instead.
…
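If this refers to RLlib (an assumption from the wording), evaluation can be run periodically during training and the smoother evaluation metric used as a stopping signal. A minimal sketch; the exact result keys vary across Ray versions:
```python
# Sketch: periodic evaluation in RLlib (assumed from the wording).
# The evaluation metric averages over fresh episodes and is usually
# smoother than the training episode reward mean.
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")
    .evaluation(
        evaluation_interval=5,     # evaluate every 5 train iterations
        evaluation_duration=20,    # average over 20 episodes
        evaluation_duration_unit="episodes",
    )
)
algo = config.build()

for i in range(200):
    result = algo.train()
    if "evaluation" in result:
        # Key layout differs between Ray versions; this matches the
        # older API stack.
        eval_mean = result["evaluation"]["episode_reward_mean"]
        print(f"iter {i}: eval episode_reward_mean = {eval_mean:.1f}")
        if eval_mean >= 475:       # stop once evaluation looks solved
            break
```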
-
https://github.com/Alescontrela/AMP_for_hardware/blob/bfb0dbdcf32bdf83a916790bddf193fffc7e79b8/rsl_rl/rsl_rl/algorithms/amp_ppo.py#L235
When using state normalization, the `sample_amp_expert` tuple…
-
Hi, first of all, great work. This is a very useful library for research on RL and NLP. It would be very helpful if off-policy RL methods like Q-learning, SAC, etc. could be added, along with benc…
-
When training and testing RL policies (arti_mani/algorithms/rl_iam/sac_train_segpts_PNfeat.py & arti_mani/algorithms/rl_iam/sac_eval_segpts_PNfeat.py), you can choose the **frameweight_sample** met…
-
- Value based RL
- [ ] DQN
- [ ] Rainbow DQN
- [ ] [CQL](https://sites.google.com/view/cql-offline-rl)
- Value based + Policy based RL
- [x] DDPG
- [ ] [TD3](https://spinni…
-
Our current baseline RL algorithm is DQN (more accurately, it is DDQN). The named algorithm uses epsilon-greedy policies so that it at least has a chance of fully exploring the environment in question. Using epsi…
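For reference, a minimal sketch of epsilon-greedy action selection with a linearly decaying epsilon; the names and the schedule are illustrative, not taken from this codebase:
```python
# Sketch: epsilon-greedy action selection with linear decay.
# Illustrative only; names and schedule are not from this codebase.
import random

import numpy as np


def epsilon_greedy(q_values, epsilon):
    """With probability epsilon explore, otherwise act greedily."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))   # random action
    return int(np.argmax(q_values))              # greedy action


def linear_epsilon(step, start=1.0, end=0.05, decay_steps=50_000):
    """Anneal epsilon from `start` to `end` over `decay_steps` steps."""
    frac = min(step / decay_steps, 1.0)
    return start + frac * (end - start)
```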