opendilab / LMDrive

[CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models
Apache License 2.0

About how specifically LMDrive works #55

Open PhilWallace opened 1 month ago

PhilWallace commented 1 month ago

Thanks for the great work!

I am new to LLM-based autonomous driving systems (ADS), and I have some questions about how LMDrive works. As stated in the paper, LMDrive builds on the Q-Former, which works as shown below:

[image: Q-Former diagram]

As I understand it, the key idea of LMDrive is to apply the Q-Former approach to LLM-based driving, and the main efforts in the paper are training the Q-Former, fusing multi-sensor data, and so on. Is this understanding correct?

Additionally, what specifically is the output of LMDrive? It seems to output future waypoints rather than direct control signals.
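
To make the distinction concrete, here is a minimal sketch of how predicted waypoints could be turned into control signals by a simple downstream controller. It is not taken from the LMDrive code: the function name `waypoints_to_control`, the proportional gains, the 0.5 s waypoint spacing, and the ego-frame convention (x forward, y to the right, in metres) are all assumptions for illustration.

```python
import math


def waypoints_to_control(waypoints, speed, dt=0.5, k_steer=1.0, k_speed=0.5):
    """Map ego-frame waypoints to (steer, throttle, brake).

    Hypothetical helper: proportional control only, with arbitrary gains.
    waypoints: list of (x, y) in metres, x forward, y to the right.
    speed: current vehicle speed in m/s.
    dt: assumed time spacing between consecutive waypoints, in seconds.
    """
    if len(waypoints) < 2:
        raise ValueError("need at least two waypoints")
    (x0, y0), (x1, y1) = waypoints[0], waypoints[1]

    # Desired speed: distance between the first two waypoints over their time spacing.
    target_speed = math.hypot(x1 - x0, y1 - y0) / dt

    # Heading error: angle from the forward axis to the midpoint of the first two waypoints.
    aim_x, aim_y = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    heading_error = math.atan2(aim_y, aim_x)

    # Proportional steering, clipped to [-1, 1]; positive steers to the right.
    steer = max(-1.0, min(1.0, k_steer * heading_error))

    # Proportional throttle, clipped to [0, 1]; brake fully if more than 1 m/s too fast.
    speed_error = target_speed - speed
    throttle = max(0.0, min(1.0, k_speed * speed_error))
    brake = 1.0 if speed_error < -1.0 else 0.0
    if brake:
        throttle = 0.0
    return steer, throttle, brake


# Straight-ahead waypoints from standstill: no steering, full throttle, no brake.
print(waypoints_to_control([(1.0, 0.0), (2.0, 0.0)], speed=0.0))
```

In this picture, the model's output would be the `waypoints` list, and the steering, throttle, and brake values come from a separate controller rather than from the model itself.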

Many thanks for your attention!