-
### Question
Do the evaluations influence the training in the SAC-Agent?
From my understanding of the code and the documentation, I would answer the question with no. But my experimental results (de…
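For context, the reason the answer would be "no" in a typical SAC implementation is that evaluation rollouts are played in a separate environment and are never written to the replay buffer, so they cannot enter any gradient update. Below is a minimal, generic sketch of that separation; the names `agent`, `buffer`, `train_env`, and `eval_env` are hypothetical and not taken from the library being discussed.
```python
# Generic off-policy (SAC-style) loop; `agent`, `buffer`, `train_env`, and
# `eval_env` are hypothetical objects, not a specific library's API.

def train(agent, buffer, train_env, eval_env, total_steps, eval_interval=10_000):
    obs, _ = train_env.reset()
    for step in range(total_steps):
        action = agent.act(obs, deterministic=False)              # exploration policy
        next_obs, reward, terminated, truncated, _ = train_env.step(action)
        buffer.add(obs, action, reward, next_obs, terminated)     # only training transitions are stored
        if len(buffer) >= 256:
            agent.update(buffer.sample(batch_size=256))           # gradients come from the buffer alone
        obs = train_env.reset()[0] if (terminated or truncated) else next_obs

        if step % eval_interval == 0:
            evaluate(agent, eval_env)  # rollouts are discarded: nothing buffered, no update


def evaluate(agent, eval_env, episodes=5):
    returns = []
    for _ in range(episodes):
        obs, _ = eval_env.reset()
        done, episode_return = False, 0.0
        while not done:
            action = agent.act(obs, deterministic=True)           # no exploration noise
            obs, reward, terminated, truncated, _ = eval_env.step(action)
            episode_return += reward
            done = terminated or truncated
        returns.append(episode_return)
    return sum(returns) / len(returns)
```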
-
**Describe the bug**
Even though some parameters are set, they are not fed into the CQL algorithm.
**To Reproduce**
```python
import d3rlpy
import gym
import d4rl
dataset, env = d3rlpy.dat…
```
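For reference, hyperparameters are normally passed through the algorithm constructor in the pre-2.0 d3rlpy API. The sketch below is an assumption about the intended usage rather than the reporter's full script; the dataset name, learning rates, and `n_epochs` are placeholders, and the exact argument names should be checked against the installed d3rlpy version.
```python
import d3rlpy

# Hypothetical reproduction of the intended usage (pre-2.0 d3rlpy API);
# the dataset name and hyperparameter values below are placeholders.
dataset, env = d3rlpy.datasets.get_dataset("hopper-medium-v0")  # uses d4rl under the hood

cql = d3rlpy.algos.CQL(
    actor_learning_rate=1e-4,    # values passed here should reach the trainer
    critic_learning_rate=3e-4,
    batch_size=256,
    use_gpu=True,
)

# One way to check whether the constructor arguments were actually stored
# (get_params() exists on d3rlpy 1.x algorithms; verify for your version).
print(cql.get_params())

cql.fit(dataset, n_epochs=10)
```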
-
On an Ubuntu 20.04 machine with RTX 3090 GPUs (each with 24 GB of memory), and having installed ManiSkill2 (and ManiSkill2-Learn) as per both READMEs, I am training PPO to get a sense of the task difficu…
-
![150096176-4c83e131-78a1-40d5-9d9c-31ff02606f00](https://user-images.githubusercontent.com/83230521/159255192-11c15360-b79a-4174-98ec-0cb7fe93a00b.png)
-
- [ ] I have marked all applicable categories:
+ [ ] exception-raising bug
+ [ ] RL algorithm bug
+ [ ] documentation request (i.e. "X is missing from the documentation.")
+ [x] ne…
-
hf_T5:
torch.rsqrt, pow, acc_ops.to, torch.isinf, any, float, type_as
hf_GPT2:
acc_ops.split, torch.where, type
soft_actor_critic:
exp, torch.functional.broadcast_tensors
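Purely as an illustration of which PyTorch calls are involved (not code extracted from hf_T5, hf_GPT2, or soft_actor_critic), a small self-contained snippet exercising several of the listed ops might look like this:
```python
import torch

# Illustrative only; not taken from the models named above.
x = torch.randn(4, 8)

# hf_T5-style pieces: rsqrt, pow, isinf, any, float, type_as
norm = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + 1e-6)
has_inf = torch.isinf(x).any()
normed = (x * norm).float().type_as(x)

# hf_GPT2-style pieces: split, where
a, b = x.split(4, dim=-1)
masked = torch.where(torch.isinf(a), torch.zeros_like(a), a)

# soft_actor_critic-style pieces: exp, broadcast_tensors
mean, log_std = torch.functional.broadcast_tensors(a.mean(), b)
std = log_std.exp()
```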
-
Recently, I finished reading the code in this repo, and I found that the entropy bonus on a state value in SAC is only added at the last output step.
This made me wonder:
If t…
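For context, in the standard SAC critic update the entropy bonus is part of every bootstrapped target, not only the final step: each target subtracts α·log π for the next action, and earlier timesteps inherit it through the bootstrapped Q-value. Below is a generic sketch of that target computation, not this repository's code; `policy.sample`, `target_q1`, and `target_q2` are hypothetical callables standing in for the actor and the target critics.
```python
import torch

# Generic sketch of the standard SAC critic target (illustrative only).
def sac_critic_target(rewards, dones, next_obs, policy, target_q1, target_q2,
                      alpha=0.2, gamma=0.99):
    with torch.no_grad():
        next_actions, next_log_probs = policy.sample(next_obs)   # a' ~ pi(.|s'), log pi(a'|s')
        target_q = torch.min(
            target_q1(next_obs, next_actions),
            target_q2(next_obs, next_actions),
        ).squeeze(-1)
        # The entropy bonus (-alpha * log pi) enters every bootstrapped target,
        # so earlier timesteps receive it through target_q rather than only at
        # the final output step.
        return rewards + gamma * (1.0 - dones) * (target_q - alpha * next_log_probs)
```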
-
PC configuration: Ubuntu 20.04, RTX 3060, 64 GB RAM, CUDA 11.4, NVIDIA driver 470.141.03.
Note: for the Cartpole and Ant simulations, the same command works, but not for Anymal.
I was trying the demo run…
-
## Problem Description
Soft Actor-Critic (`sac_continuous_action.py`) just doesn't work. I have tried different envs; it doesn't matter.
## Current Behavior
```
python cleanrl/sac_continuous_…
```