Question - Episodic environment with only one step

PKU-MARL / HARL

Official implementation of HARL algorithms based on PyTorch.

Hi.

First of all, thank you for sharing this repository so that others can benefit from your expertise in RL.

I have a question about whether it is possible to create an episodic environment in which each episode consists of only one step. From what I can see in the code, the action in the first step is always chosen at random. If that is right, the agent's decision in a one-step episode would rest on knowledge from previous episodes rather than on the actual observation.

Hence, is it possible to create such an environment, or am I misinterpreting the code?
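
To make the question concrete, here is a minimal sketch of the kind of one-step episodic environment meant above. It is plain Python and does not use HARL's environment interface; the class `OneStepEnv`, the helper `average_reward`, and the reward rule are all hypothetical, and it is single-agent for brevity. The observation tells the agent which action is rewarded, so a policy can only do well if its first (and only) action depends on the observation.

```python
import random


class OneStepEnv:
    """Hypothetical environment whose episodes last exactly one step."""

    def __init__(self, n_actions=3, seed=None):
        self.n_actions = n_actions
        self.rng = random.Random(seed)
        self.target = None  # set by reset()

    def reset(self):
        # The observation is the index of the rewarded action.
        self.target = self.rng.randrange(self.n_actions)
        return self.target

    def step(self, action):
        if self.target is None:
            raise RuntimeError("call reset() before step()")
        reward = 1.0 if action == self.target else 0.0
        self.target = None  # the episode is over
        done = True  # every episode terminates after its first step
        return None, reward, done, {}


def average_reward(policy, env, episodes=1000):
    """Run `episodes` one-step episodes and return the mean reward."""
    total = 0.0
    for _ in range(episodes):
        obs = env.reset()
        _, reward, done, _ = env.step(policy(obs))
        assert done
        total += reward
    return total / episodes
```

In this sketch, a policy that uses the observation (`lambda obs: obs`) is rewarded in every episode, whereas a policy that picks its first action at random cannot do better than chance on average.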

Thank you in advance for your answer.

Best regards.

Question - Episodic environment with only one step #33