cure-lab / MagicDrive

[ICLR24] Official implementation of the paper “MagicDrive: Street View Generation with Diverse 3D Geometry Control”
https://gaoruiyuan.com/magicdrive/
GNU Affero General Public License v3.0

[video-generation-ego-movement] Question about video generation #33

Closed by kandrio 4 weeks ago

kandrio commented 1 month ago

Good afternoon,

For video generation, the MagicDrive paper states that only the first and the last frames have bounding boxes (section 5.4). I have the following question:

The ego pose of the car changes over these 7 frames (around 4 seconds), but if I understand correctly, both the bounding boxes and the camera poses are expressed relative to the ego vehicle. They therefore carry no information about how the ego pose changes from the first frame to the last. To me, this seems like important information to inject into the video generation model, but I may be missing something.

Thank you in advance for your feedback.

flymin commented 1 month ago

You are correct: we do not explicitly control the ego motion in our paper. Any such information comes from the temporal relationships the model learns from the data.
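
For readers following the thread, here is a minimal sketch of the geometry behind the question. It is not taken from the MagicDrive code. The function names (`world_to_ego`, `relative_pose`), the planar `(x, y, yaw)` pose format, and all numbers are assumptions made for illustration. It shows that a static object's position in the ego frame changes as the car moves, while the ego displacement itself only becomes available if the relative pose between frames is computed and supplied as a separate signal.

```python
import math

def world_to_ego(point, ego_pose):
    """Express a world-frame 2D point in the ego frame of a planar pose (x, y, yaw)."""
    x, y = point
    xe, ye, yaw = ego_pose
    dx, dy = x - xe, y - ye
    c, s = math.cos(yaw), math.sin(yaw)
    # Apply the inverse rotation R(yaw)^T to the world-frame offset.
    return (c * dx + s * dy, -s * dx + c * dy)

def relative_pose(pose_a, pose_b):
    """Pose of frame b expressed in frame a, i.e. the ego motion from a to b."""
    x_rel, y_rel = world_to_ego((pose_b[0], pose_b[1]), pose_a)
    d_yaw = pose_b[2] - pose_a[2]
    # Wrap the heading difference to (-pi, pi].
    yaw_rel = math.atan2(math.sin(d_yaw), math.cos(d_yaw))
    return (x_rel, y_rel, yaw_rel)

# Hypothetical ego poses in a world frame, one per video frame.
ego_poses = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.5, 0.1)]
# Hypothetical static object (e.g. a parked car) in the world frame.
parked_car = (10.0, 3.0)

for i, pose in enumerate(ego_poses):
    box_in_ego = world_to_ego(parked_car, pose)
    motion = relative_pose(ego_poses[0], pose)
    print(f"frame {i}: box in ego frame = {box_in_ego}, ego motion since frame 0 = {motion}")
```

With ego-frame boxes alone, a model cannot tell whether the object moved or the ego vehicle moved. The output of `relative_pose` is the kind of quantity that would disambiguate the two if it were injected as an extra condition.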