PKU-MARL / HARL

Official implementation of HARL algorithms based on PyTorch.

Question - Episodic environment with only one step #33

Closed: jovanchog closed this issue 7 months ago

jovanchog commented 7 months ago

Hi.

First of all, I want to thank you for sharing this repository so that others can benefit from your expertise on RL.

I have a question about whether it is possible to create an episodic environment with only one step. From what I can see in the code, the action in the first step always seems to be chosen at random. If that is the case, the agent would be making its decision based on knowledge from previous episodes rather than on the current observation.

Hence, is it possible to create such an environment, or am I misinterpreting the code?

Thank you in advance for your answer.

Best regards.

Ivan-Zhong commented 7 months ago

Hello, thank you for acknowledging our work.

The action chosen in the first step is not chosen at random. Instead, it is based on the first observation after the environment resets. So I guess there is some misinterpretation here.
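
To illustrate the point, here is a minimal sketch of a one-step episode. It is not HARL's actual code: `OneStepEnv`, `policy`, and `run_episodes` are hypothetical names made up for this example, and the environment is a toy. It only shows the order of calls, where the observation returned by `reset()` is what the action is computed from, even when the episode ends after a single step.

```python
import random


class OneStepEnv:
    """Hypothetical toy environment: every episode lasts exactly one step."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.target = None

    def reset(self):
        # The first observation is produced here, before any action is taken.
        self.target = self.rng.randint(0, 1)
        return self.target

    def step(self, action):
        # Reward 1.0 if the action matches the observation; the episode always ends.
        reward = 1.0 if action == self.target else 0.0
        done = True
        return self.target, reward, done


def policy(obs):
    # Stand-in for a trained actor: the action is a function of the observation.
    return obs


def run_episodes(env, n):
    total = 0.0
    for _ in range(n):
        obs = env.reset()            # observation available before the first action
        action = policy(obs)         # first (and only) action conditioned on obs
        _, reward, done = env.step(action)
        assert done                  # one-step episode
        total += reward
    return total
```

In this sketch, a policy that reads the reset observation gets the full reward in every episode, which would not be possible if the first action ignored the observation.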