-
First of all, thank you for providing these great baselines!
I can train the policies for the various algorithms (PPO1/PPO2/TRPO) and see that average reward increases and loss decreases, but is th…
-
I am still struggling with the implementation of a recurrent policy. The trick from [#1](https://github.com/Khrylx/PyTorch-RL/issues/1) worked and I can now start running my RNN GAIL Network. But no m…
-
- [x] check and fix C51 [deaab73]
- [x] check qrdqn [deaab73]
- [ ] check iqn
- [ ] check and fix Rainbow
- [ ] check on-policy buffer sampling
- [ ] check function `discounted_sum`
- [ ] check …
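
For anyone checking the `discounted_sum` item: the sketch below is a small reference implementation to compare against. It assumes the function is meant to compute the usual discounted reward-to-go, G_t = r_t + gamma * G_{t+1}, over a single episode. The name `discounted_returns` and its signature are hypothetical and not taken from this repository.

```python
def discounted_returns(rewards, gamma):
    """Reference reward-to-go for one episode (hypothetical helper).

    Assumes the episode ends after the last reward, so there is no
    bootstrap value added at the tail.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the next step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    print(discounted_returns([1.0, 1.0, 1.0], 0.5))
```

Comparing the output of the real function against a plain loop like this on a few short episodes is usually enough to catch off-by-one or episode-boundary mistakes.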
-
I am trying to run a simple TensorFlow example on CartPole with the LSTMNetwork. It appears that a critical member is missing from the class. I get the same error when using the GRUNetwork.
> Error…
-
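1. Finished the RL material, although I am still confused by TRPO and PPO
2. Finished the fourth section of Andrew Ng's course
3. Did not study anything else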
-
Hi,
Thanks a lot for this extremely useful implementation.
I just wanted to ask what the ZFilter class is. Is it used to standardize the observed state according to the running mean and std of t…
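
To make the question concrete, here is a minimal sketch of what a running mean/std observation filter generally looks like. It is not the repository's ZFilter: the class names `RunningStat` and `RunningNormalizer`, the `clip` and `eps` parameters, and the `update` flag are all assumptions for illustration. It uses Welford's online update in plain Python.

```python
import math


class RunningStat:
    """Per-dimension running mean and variance (Welford's method)."""

    def __init__(self, dim):
        self.n = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations

    def push(self, x):
        self.n += 1
        for i, xi in enumerate(x):
            delta = xi - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (xi - self.mean[i])

    def std(self):
        # With fewer than two samples the variance is undefined; use 1.0.
        if self.n < 2:
            return [1.0] * len(self.mean)
        return [math.sqrt(m2 / (self.n - 1)) for m2 in self.m2]


class RunningNormalizer:
    """Hypothetical stand-in for a ZFilter-style observation filter."""

    def __init__(self, dim, clip=10.0, eps=1e-8):
        self.stat = RunningStat(dim)
        self.clip = clip
        self.eps = eps

    def __call__(self, x, update=True):
        # Set update=False at evaluation time to freeze the statistics.
        if update:
            self.stat.push(x)
        std = self.stat.std()
        out = [(xi - m) / (s + self.eps)
               for xi, m, s in zip(x, self.stat.mean, std)]
        return [max(-self.clip, min(self.clip, o)) for o in out]


if __name__ == "__main__":
    obs_filter = RunningNormalizer(dim=2)
    for obs in ([1.0, 10.0], [3.0, 30.0], [5.0, 50.0]):
        print(obs_filter(obs))
```

If ZFilter works along these lines, the statistics are accumulated over every state seen during training, so the same filter object would need to be saved and reused when the policy is evaluated.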
-
Hi, I have just installed the rllab environment and ran the example code trpo_cartpole_pickled.py successfully. It produced the log files "debug.log params.pkl progress.csv variant.json". And when I am …
-
After running `pip install -r requirements.txt`,
I ran
`python train.py --config configs/maml/halfcheetah-vel.yaml --output-folder maml-halfcheetah-vel --seed 1 --num-workers 8`
but progress doesn't i…
-
- MARL:
- [x] MADDPG
- [x] MASAC 1346949
- [x] IQL
- [x] VDN
- [x] Q-MIX
- [x] Qatten ad8be31
- [ ] MAPPO
- [ ] COMA
- [ ] QTRAN-alt
- [x] QTRAN-base 4c45ba0
- [x] QPL…
-
I want to use PPO2 or TRPO to sample a random policy and then use GAIL for imitation learning.
Could you share some ideas with me?
I would greatly appreciate your help.