opendilab / LMDrive

[CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models
Apache License 2.0
The problem of model feature extraction from decoder #62

Open pisces365 opened 3 months ago

pisces365 commented 3 months ago

Hello, thank you for your excellent work. In memfuser.py, how were the indices 2500 and 2501:2506 chosen for slicing the features queried from the decoder?

        traffic_feature = hs[:, :2500]
        traffic_light_state_feature = hs[:, 2500]
        stop_sign_feature = hs[:, 2500]
        waypoints_feature = hs[:, 2501:2506]

Thanks!

deepcs233 commented 3 months ago

Hi!

Sorry for the late reply. I've been very busy lately.

The five features in hs[:, 2501:2506] are the query tokens for the next five waypoints. Waypoints are the coordinates that the autonomous vehicle should reach at specific future times.

pisces365 commented 2 months ago

Thank you for your reply! By the way, I have another question. How was the number 2500 determined, and where can I find a detailed explanation? @deepcs233

deepcs233 commented 2 months ago

Hi!

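The number 2500 comes from the 50 $\times$ 50 BEV feature map size: flattened, the map gives 50 $\times$ 50 = 2500 tokens, which is why hs[:, :2500] holds the traffic (BEV) features.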
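To make the index arithmetic concrete, here is a minimal sketch of the query layout implied by the slices quoted in this thread. The constant and variable names (BEV_H, BEV_W, NUM_WAYPOINTS, layout) are illustrative assumptions, not identifiers from memfuser.py, and a plain Python list stands in for one row of hs.

```python
# Sketch of the decoder query layout implied by the slices in this thread.
# All names here are hypothetical; only the numbers come from the discussion.
BEV_H, BEV_W = 50, 50   # BEV feature map size mentioned in the reply
NUM_WAYPOINTS = 5       # the next five waypoints mentioned in the reply

num_bev_tokens = BEV_H * BEV_W                 # flattened BEV map: 2500 tokens
state_index = num_bev_tokens                   # single token at index 2500
waypoint_start = state_index + 1               # first waypoint token: 2501
waypoint_end = waypoint_start + NUM_WAYPOINTS  # exclusive end: 2506

layout = {
    "traffic_feature": slice(0, num_bev_tokens),
    "traffic_light_state_feature": state_index,
    "stop_sign_feature": state_index,
    "waypoints_feature": slice(waypoint_start, waypoint_end),
}

# A list of token positions stands in for one sample of hs (sequence axis only).
tokens = list(range(waypoint_end))
bev = tokens[layout["traffic_feature"]]
wps = tokens[layout["waypoints_feature"]]

print(len(bev), tokens[state_index], wps)
# prints: 2500 2500 [2501, 2502, 2503, 2504, 2505]
```

Under these assumptions the sequence has 2506 query tokens in total: 2500 for the BEV grid, one at index 2500 that both the traffic light and stop sign slices read in the quoted code, and five for the waypoints.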