cyanrain7 / TRPO-in-MARL

MIT License
186 stars 49 forks

Scripts fail when running the HATRPO algorithm with an 【rnn】 network #14

Closed junjunjun-learner closed 2 years ago

junjunjun-learner commented 2 years ago

Hello, I tried to run your code with the HATRPO algorithm and an 【rnn】 network. Specifically, I added "--use_recurrent_policy" to both scripts, train_smac.sh and train_mujoco, and set algo='hatrpo'. However, both scripts fail with the following error:

RuntimeError: the derivative for '_cudnn_rnn_backward' is not implemented. Double backwards is not supported for CuDNN RNNs due to limitations in the CuDNN API. To run double backwards, please disable the CuDNN backend temporarily while running the forward pass of your RNN. For example: with torch.backends.cudnn.flags(enabled=False): output = model(inputs)

cyanrain7 commented 2 years ago

Thanks for your question. We have not implemented RNN-based agents for HAPPO and HATRPO. You can modify the relevant modules (replay buffer, runner, and so on) to support an RNN network.
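
For readers who hit the same RuntimeError: TRPO-style updates differentiate a gradient a second time (for example, to form Fisher-vector products from the KL term), which is why the cuDNN RNN backend fails here. The workaround named in the error message can be sketched as below. This is only an illustration under stated assumptions, not code from this repository: the `nn.GRU` module, the tensor shapes, and the squared-output scalar are hypothetical stand-ins for the recurrent actor and its KL term. It does not add the recurrent replay-buffer and runner support mentioned above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical stand-in for a recurrent actor network.
rnn = nn.GRU(input_size=4, hidden_size=8).to(device)
inputs = torch.randn(5, 3, 4, device=device)  # (seq_len, batch, features)
params = list(rnn.parameters())

# Run the forward pass with cuDNN disabled, as the error message suggests,
# so the recorded graph uses the native RNN kernels that support double backward.
with torch.backends.cudnn.flags(enabled=False):
    output, _ = rnn(inputs)

# Placeholder scalar; in a TRPO-style update this would be the KL term.
scalar = output.pow(2).mean()

# First backward pass: keep the graph so the gradient can be differentiated again.
grads = torch.autograd.grad(scalar, params, create_graph=True)
flat_grad = torch.cat([g.reshape(-1) for g in grads])

# Second backward pass: a Hessian-vector product with an arbitrary vector.
vector = torch.randn_like(flat_grad)
hvp = torch.autograd.grad(flat_grad @ vector, params)
flat_hvp = torch.cat([h.reshape(-1) for h in hvp])
```

Only the forward pass needs to sit inside the `with` block; the backward passes reuse whichever kernels were recorded in the graph. Disabling cuDNN usually makes the RNN forward pass slower, so it is reasonable to limit the block to the forward passes that feed a second-order computation.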