wuxiyang1996 / iPLAN

iPLAN: Intent-Aware Planning in Heterogeneous Traffic via Distributed Multi-Agent Reinforcement Learning
https://arxiv.org/abs/2306.06236
MIT License
32 stars, 3 forks

Why env.reset() in cmd='step'? #5

Open Captain6606 opened 2 months ago

Captain6606 commented 2 months ago

Dear Author: Why is the environment reset on line 151? https://github.com/wuxiyang1996/iPLAN/blob/7e4b79a6083fa7800dbfb3e05bdbff981f40bed2/envs/env_wrappers.py#L139-L151 I look forward to hearing from you.

wuxiyang1996 commented 2 months ago

This means that all the threads, each running a separate env, have terminated, in which case the parallel envs restart.

I assume you were expecting the terminated envs to keep running until time runs out, right?
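For readers following along, here is a minimal sketch of the auto-reset pattern described above. It assumes the worker follows the common convention for vectorized environments: step once, and if the episode is over, replace the returned observation with a fresh one from reset(). ToyEnv and step_with_auto_reset are hypothetical names used for illustration only and are not taken from the repository.

```python
import numpy as np


class ToyEnv:
    """Hypothetical multi-agent env whose episode ends after `horizon` steps."""

    def __init__(self, horizon=3, n_agents=2):
        self.horizon = horizon
        self.n_agents = n_agents
        self.t = 0

    def reset(self):
        self.t = 0
        return np.zeros(self.n_agents)

    def step(self, action):
        self.t += 1
        ob = np.full(self.n_agents, float(self.t))
        reward = np.ones(self.n_agents)
        # Per-agent done flags; all agents finish at the same time here.
        done = np.full(self.n_agents, self.t >= self.horizon)
        return ob, reward, done, {}


def step_with_auto_reset(env, action):
    """Step once; if every done flag is set, return a fresh initial observation."""
    ob, reward, done, info = env.step(action)
    # np.all handles both a single bool and a per-agent array of flags.
    if np.all(done):
        ob = env.reset()
    return ob, reward, done, info
```

With this pattern the caller never has to send a separate reset command between episodes: the observation returned alongside done already belongs to the next episode.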

Captain6606 commented 2 months ago

I think that's partly true. When the for loop finishes executing (https://github.com/wuxiyang1996/iPLAN/blob/7e4b79a6083fa7800dbfb3e05bdbff981f40bed2/runners/ippo_parallel_runner.py#L166), the statement is executed. That relates to what I asked in issue #4: 16 initial states but only 8 outputs. But why initialize the environment here? I thought it might be for a test after training, but I didn't see any test code or any output.

After that, I tried replacing ob = env.reset() with pass. https://github.com/wuxiyang1996/iPLAN/blob/7e4b79a6083fa7800dbfb3e05bdbff981f40bed2/envs/env_wrappers.py#L151 The program still ran, but something odd happened: the same data produced different results before and after the replacement. I am very confused.
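One possible explanation for the differing results, offered as an assumption and not confirmed anywhere in this thread: if reset() draws from a random number generator that later steps also use, then removing a reset() call shifts the random stream, so the same seed yields different numbers afterwards. The sketch below illustrates that effect with a hypothetical RandomStartEnv; it does not reproduce the repository's environment.

```python
import numpy as np


class RandomStartEnv:
    """Hypothetical env whose reset() and step() share one random generator."""

    def __init__(self, seed):
        self.rng = np.random.default_rng(seed)

    def reset(self):
        # Sampling a random start state consumes draws from the generator.
        return self.rng.uniform(-1.0, 1.0)

    def step(self):
        # Transition noise comes from the same generator.
        return self.rng.normal()


def rollout(call_extra_reset, seed=0, n_steps=3):
    """Run a short rollout, optionally calling reset() one extra time first."""
    env = RandomStartEnv(seed)
    env.reset()
    if call_extra_reset:
        env.reset()  # stands in for the reset call that was replaced with pass
    return [env.step() for _ in range(n_steps)]


with_reset = rollout(call_extra_reset=True)
without_reset = rollout(call_extra_reset=False)
print(with_reset != without_reset)
```

Each variant is still reproducible on its own for a fixed seed; only the comparison between the two variants changes.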

wuxiyang1996 commented 2 months ago

You may schedule a virtual meeting with me at some point; my contact information can be found on my profile: https://wuxiyang1996.github.io/

For others who encounter this issue later: I will post a summary of this meeting afterwards.

Captain6606 commented 2 months ago

I have sent an email to you, please check it.