Closed harshakokel closed 4 years ago
My guess is this line is set to false: https://github.com/richardrl/rlkit-relational/blob/e986f15a21e9ee54d03eea654f55c4e587cb9263/rlkit/torch/sac/sac.py#L107
Set a breakpoint at that line to check. If it's false, make sure the class is initialized with use_automatic_entropy_tuning=True.
Added a breakpoint in sac.py. Turns out sac was not being used. But that led me to look into use_automatic_entropy_tuning in twin_sac.py. Figured out that MPI was not installed in my Python env.
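A quick way to confirm whether an MPI binding is visible to the interpreter that runs the training script is sketched below. It assumes the code depends on the mpi4py package, which is an assumption on my part; the thread only says MPI was missing from the Python env. The helper name module_available is hypothetical.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be found in this environment."""
    return importlib.util.find_spec(name) is not None


# "mpi4py" is an assumed package name; substitute whichever MPI binding
# the training code actually imports.
if module_available("mpi4py"):
    print("mpi4py is importable")
else:
    print("mpi4py is missing from this Python env")
```

Running this with the same python that mpirun launches avoids checking the wrong environment.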
After running mpirun -np 35 python examples/relationalrl/train_pickandplace1.py for a few hours, the logs stopped updating. I can see that all the processes are still running on my machine, but I do not see any log updates. Is this expected behavior? I did not find any errors in the logs.
Never mind, it turns out this was just a problem with the nohup command.
Details here: nohup-does-not-work-mpirun
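For anyone who would rather not rely on nohup for long runs, one alternative is to start the job from Python in its own session, with stdin detached and output sent to a log file. This is only a sketch under my own assumptions, not the fix described on the linked page; launch_detached and the log file name are hypothetical.

```python
import subprocess


def launch_detached(cmd, log_path):
    """Start `cmd` in a new session, with stdin closed and output appended to a log file."""
    log_file = open(log_path, "ab")
    try:
        return subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,   # the child never waits on the terminal for input
            stdout=log_file,
            stderr=subprocess.STDOUT,   # keep errors in the same log
            start_new_session=True,     # POSIX only: run the child in its own session
        )
    finally:
        # The child holds its own copy of the file descriptor,
        # so the parent can close its handle right away.
        log_file.close()
```

For the command in this thread, the call would look like launch_detached(["mpirun", "-np", "35", "python", "examples/relationalrl/train_pickandplace1.py"], "train.log"), where train.log is a placeholder name.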
@harshakokel Excellent! Please let me know what results you are able to get and if you have any further issues.
Hello,
I am trying to follow the ReNN doc to replicate the results and I am getting the following error on running
mpirun -np 35 python examples/relationalrl/train_pickandplace1.py
Can you help me figure out what I am doing wrong?