rail-berkeley / rlkit

Collection of reinforcement learning algorithms
MIT License

"Forgetting Learning" using SAC in a Drone environment on PyRep #93

Open kaelgabriel opened 4 years ago

kaelgabriel commented 4 years ago

Hi guys, I'm using a drone environment that a friend of mine made using V-REP (and now we are using PyRep).

We got it to converge using Tensorforce PPO and also RL-ADVENTURE2 SAC. I could not make it work on "softlearning" because of the way PyRep uses processes/threads.

So it seems to me that the env is legit.

Anyway, when I use rlkit, even after tweaking hyperparameters, things like this happen:

[attached screenshot: training curves]

Has anyone ever seen this happen?

Thanks for your time.

vitchyr commented 4 years ago

Hmm it's hard to say. What hyperparameters are you tuning? And are you using the same hyperparameters as in RL-ADVENTURE2?

kaelgabriel commented 4 years ago

@vitchyr, thanks for your answer.

I've figured out that it's a GPU problem. My tensors on the GPU and on the CPU come out different, even after casting everything to float64 (which required changing some parts of the library).
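To illustrate the kind of mismatch described here: this is a minimal NumPy sketch (hypothetical, not rlkit code) of why values that live as float32 on the GPU and float64 on the CPU can compare unequal even when they "should" be the same number.

```python
import numpy as np

# Hypothetical illustration: the same literal stored at different
# precisions is not bit-identical. GPUs typically work in float32,
# while NumPy's CPU default is float64.
a = np.float32(0.1)   # 32-bit representation
b = np.float64(0.1)   # 64-bit representation

# float32(0.1) widens to roughly 0.10000000149..., so:
print(a == b)  # → False
```

This is why casting everything to one dtype (as the commenter tried) is usually the right direction, though it has to be done consistently on every tensor that crosses the CPU/GPU boundary.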

So I will have to use the CPU for this problem, since I don't have time to keep debugging.
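For reference, rlkit selects its device through its `pytorch_util` module; a sketch of forcing CPU-only training might look like the fragment below. `ptu.set_gpu_mode` is the call used in rlkit's bundled example scripts; the surrounding experiment wiring is an assumption and depends on your own launcher.

```python
# Config sketch (assumes rlkit is installed): keep all tensors on CPU,
# sidestepping the float32/float64 GPU mismatch described above.
import rlkit.torch.pytorch_util as ptu

ptu.set_gpu_mode(False)  # place networks and buffers on CPU
# ... then build and run your SAC experiment as usual
```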

Thanks