Hi @1900360, support for continuous action spaces is not implemented, although I think this should be fairly easy to add.
I tried changing the code but I get the following error:
```
Traceback (most recent call last):
  File "C:\Users\1900.conda\envs\tf\lib\multiprocessing\connection.py", line 312, in _recv_bytes
    nread, err = ov.GetOverlappedResult(True)
BrokenPipeError: [WinError 109] The pipe has been ended.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/desktop/lunwen_dabao/xinsuanfa0912/lstm_ppo_continue/train.py", line 35, in
```
An EOFError usually indicates that something on the environment side did not work. For debugging purposes I recommend running enjoy.py with an untrained model, because it does not use threading and therefore the actual exceptions are shown.
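A minimal single-process sanity check in the same spirit can also surface the real exception. This is only an illustrative sketch (not the repo's enjoy.py), assuming a Gym-style environment such as Pendulum-v0 with the old gym API:

```python
# Step the environment directly in one process so that any exception raised on
# the environment side is printed, instead of being swallowed by a worker and
# only showing up as EOFError / BrokenPipeError in the main process.
import gym

env = gym.make("Pendulum-v0")
obs = env.reset()
for _ in range(200):
    action = env.action_space.sample()  # a random (untrained) policy is enough for debugging
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
env.close()
```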
I have changed your code to work in a continuous environment such as Pendulum-v0, but the reward curve does not rise during training (see the attached training curve).
I have also attached the changed code below. Please take the time to check whether it is correct: lstm_ppo_continue.zip
Hope you can help me with this; I've been stuck on this code for days :(
Hi @1900360, I don't have the time to investigate your code. I can recommend this repo for looking up the vital changes needed to allow for continuous actions: https://github.com/PG649-3D-RPG/neroRL/tree/develop Also have a look at https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/ppo_continuous_action.py
One important thing is to change all activations from relu to tanh. Reward and observation normalization are also very important. A rough sketch of the continuous action head is below.
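The core change is replacing the categorical action head with a diagonal Gaussian. A minimal sketch in PyTorch, assuming a hidden state of size `hidden_size` and a Box action space of size `action_dim` (the class and argument names are illustrative, not this repo's API):

```python
import torch
import torch.nn as nn
from torch.distributions import Normal

class GaussianActorHead(nn.Module):
    """Maps the (LSTM) hidden state to a diagonal Gaussian over continuous actions."""
    def __init__(self, hidden_size: int, action_dim: int):
        super().__init__()
        # tanh activations instead of relu, as recommended above
        self.body = nn.Sequential(nn.Linear(hidden_size, hidden_size), nn.Tanh())
        self.mu = nn.Linear(hidden_size, action_dim)
        # state-independent log std, the common choice in PPO implementations
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def forward(self, h: torch.Tensor) -> Normal:
        mean = self.mu(self.body(h))
        std = self.log_std.exp().expand_as(mean)
        return Normal(mean, std)

# In the PPO loss, log-probs and entropies are summed over the action dimensions:
# dist = head(hidden_state)
# action = dist.sample()
# log_prob = dist.log_prob(action).sum(dim=-1)
# entropy = dist.entropy().sum(dim=-1)
```

For the normalization part, the cleanrl script linked above wraps the environment with gym.wrappers.NormalizeObservation and gym.wrappers.NormalizeReward (plus clipping), which is a convenient starting point.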
Hope this helps.
I'm closing this issue as it seems stale. Please reach out again if you have more questions.
I have tested this repo on 'MountainCarContinuous-v0', but it couldn't work. What needs to be modified here to run a continuous environment?