MCZhi / GameFormer-Planner

[ICCV & CVPR Workshop] Learning-enabled Interactive Prediction and Planning Framework for Autonomous Vehicles
https://mczhi.github.io/GameFormer/
MIT License

about ego trajectory output head NeuralPlanner #7

Closed fshi2006 closed 9 months ago

fshi2006 commented 9 months ago

Hello, I have a question. You designed a special output head called NeuralPlanner to produce the ego's initial trajectory for the planning task. Why not just use the highest-scored ego trajectory from the level-k heads' output? Looking forward to your answer, thanks!

fshi2006 commented 9 months ago

My understanding is that this design lets the model plan the level-(k+1) ego trajectory based on the level-k scenario. I am not sure if that is correct. However, this approach also loses the multimodality of the ego trajectory; can it really achieve a better result than the best-scored trajectory at level k?

MCZhi commented 9 months ago

Hi, @fshi2006, thank you for your question. Yes, your understanding is absolutely reasonable. Another important consideration is that the probabilities predicted by the model may not be very accurate, so to ensure planning performance, we directly train another head to output a single ego trajectory. My guess is that introducing multimodality is not guaranteed to yield better results, but you can definitely try that.
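A minimal sketch of the two options discussed above, purely for illustration: tensor names, shapes, and the `SingleTrajPlanner` module are assumptions, not the repository's actual API. Option (a) selects the highest-scored ego mode from the level-k decoder output; option (b) decodes a single ego plan from a dedicated head trained directly against the ground-truth trajectory, so planning does not depend on possibly miscalibrated mode probabilities.

```python
# Illustrative sketch only -- names and shapes are assumptions.
import torch
import torch.nn as nn

batch, modes, horizon = 4, 6, 80                         # assumed: 6 ego modes, 8 s @ 10 Hz
level_k_trajs = torch.randn(batch, modes, horizon, 2)    # (x, y) waypoints per mode
mode_scores = torch.randn(batch, modes)                  # predicted mode logits

# (a) Pick the ego trajectory with the highest predicted score.
best_mode = mode_scores.argmax(dim=-1)                             # (batch,)
ego_from_level_k = level_k_trajs[torch.arange(batch), best_mode]   # (batch, horizon, 2)

# (b) A separate single-trajectory planning head, trained directly on the
# ground-truth ego trajectory (the NeuralPlanner idea discussed above).
class SingleTrajPlanner(nn.Module):
    def __init__(self, feat_dim: int = 256, horizon: int = 80):
        super().__init__()
        self.horizon = horizon
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, horizon * 2),
        )

    def forward(self, ego_feature: torch.Tensor) -> torch.Tensor:
        # ego_feature: (batch, feat_dim) scene/ego embedding from the encoder
        return self.mlp(ego_feature).view(-1, self.horizon, 2)

planner = SingleTrajPlanner()
ego_plan = planner(torch.randn(batch, 256))   # (batch, horizon, 2)
```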

fshi2006 commented 9 months ago

Thanks a lot! I also listened to your tech talk yesterday; may I ask two more questions?

  1. Have you compared GameFormer with DTPP? Which achieves better performance?
  2. In your GameFormer experiments, the performance of CQL was very poor. In your opinion, what is the reason for the poor performance of offline-RL-based solutions? Is it because they perform poorly in unfamiliar scenarios? What if there were enough data? Looking forward to your answer, thanks!

MCZhi commented 9 months ago

Hi, @fshi2006, thank you for coming to the talk! Here are the responses.

  1. Regarding performance, the two approaches are comparably effective; DTPP performs slightly better.
  2. The primary challenge with offline RL in autonomous driving is that there are no collisions in real-world driving datasets, which makes Q-learning ineffective. Additionally, accurately annotating rewards within these datasets is almost impossible. Therefore, I think merely increasing the dataset size is unlikely to resolve these fundamental issues.

StephenGordan commented 8 months ago

Will DTPP be open-sourced? Or has TPP already been open-sourced?