camera pose coding and change camera intrinsic/extrinsic to generate images

cure-lab / MagicDrive

[ICLR24] Official implementation of the paper “MagicDrive: Street View Generation with Diverse 3D Geometry Control”

https://gaoruiyuan.com/magicdrive/

GNU Affero General Public License v3.0


camera pose coding and change camera intrinsic/extrinsic to generate images #21

Closed imbinwang closed 3 months ago

imbinwang commented 3 months ago

Thanks for sharing this work. As Section 4.1 of your paper shows, the camera parameters are encoded with a Fourier embedding and an MLP. Given that, could we slightly change the camera intrinsics/extrinsics to generate images? I didn't see any results on camera control in your paper.
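For readers unfamiliar with this kind of encoding, here is a minimal NumPy sketch of Fourier-embedding a camera's parameters and passing them through a small MLP. It is not the MagicDrive implementation: the flattened 21-value layout, the number of frequency bands, the layer sizes, the random weights, and the helper names `fourier_embed`, `camera_vector`, and `mlp` are all assumptions made for illustration.

```python
import numpy as np

def fourier_embed(x, num_freqs=4):
    """Map each value v to [sin(2^k * pi * v), cos(2^k * pi * v)] for k < num_freqs."""
    x = np.asarray(x, dtype=np.float64)
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi        # (num_freqs,)
    angles = x[..., None] * freqs                        # (..., D, num_freqs)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(*x.shape[:-1], -1)              # (..., D * 2 * num_freqs)

def camera_vector(K, R, t):
    """Flatten intrinsics K (3x3) and extrinsics [R|t] (3x4) into 21 values.
    This layout is an assumption, not necessarily the paper's."""
    Rt = np.concatenate([R, np.reshape(t, (3, 1))], axis=1)
    return np.concatenate([K.reshape(-1), Rt.reshape(-1)])

def mlp(x, W1, b1, W2, b2):
    """Two-layer perceptron with a ReLU in between."""
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

# Hypothetical camera: focal length and principal point in pixels.
K = np.array([[1266.0, 0.0, 800.0],
              [0.0, 1266.0, 450.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([1.5, 0.0, 1.6])

emb = fourier_embed(camera_vector(K, R, t))              # 21 * 8 = 168 features

# Slightly change the intrinsics, as the question proposes.
K_shifted = K.copy()
K_shifted[0, 0] *= 1.05
K_shifted[1, 1] *= 1.05
emb_shifted = fourier_embed(camera_vector(K_shifted, R, t))

# Random, untrained weights; a real model would learn these.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(168, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 8)), np.zeros(8)
out = mlp(emb, W1, b1, W2, b2)
out_shifted = mlp(emb_shifted, W1, b1, W2, b2)
```

The sketch only shows that a perturbed camera yields a different conditioning vector. Whether a trained generator responds to that difference in a geometrically meaningful way depends on the training data.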

flymin commented 3 months ago

That’s a good question. Currently, control over the camera pose is not very accurate because the training data lacks diversity. You can check the discussion here and Appendix J for some exploration.

imbinwang commented 3 months ago

Appendix J shows that the camera FOV can be edited. Thanks for the clarification.
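As a side note on what editing the FOV means for the intrinsics: under the standard pinhole model, the horizontal field of view and the focal length in pixels are related by `fx = W / (2 * tan(fov / 2))`. The sketch below applies that relation; the helper names, the example matrix, and the 1600-pixel image width are assumptions for illustration and are not taken from the MagicDrive code.

```python
import math
import numpy as np

def focal_from_fov(fov_deg, image_width):
    """Pinhole relation: focal length in pixels for a given horizontal FOV."""
    return image_width / (2.0 * math.tan(math.radians(fov_deg) / 2.0))

def set_horizontal_fov(K, fov_deg, image_width):
    """Return a copy of K whose focal lengths match the requested horizontal FOV.
    fy is scaled by the same ratio as fx so the pixel aspect ratio is unchanged."""
    K_new = np.array(K, dtype=np.float64)
    new_fx = focal_from_fov(fov_deg, image_width)
    scale = new_fx / K_new[0, 0]
    K_new[0, 0] = new_fx
    K_new[1, 1] *= scale
    return K_new

# Hypothetical intrinsics for a 1600-pixel-wide image.
K = np.array([[1266.0, 0.0, 800.0],
              [0.0, 1266.0, 450.0],
              [0.0, 0.0, 1.0]])

# Widening the FOV shortens the focal length.
K_wide = set_horizontal_fov(K, fov_deg=80.0, image_width=1600)
```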

imbinwang commented 3 months ago

Another question: could a car be turned upside down to generate an accident scene by changing the orientation of its control box? The paper demonstrates that a car can be rotated from head to tail. The nuScenes dataset also lacks such training cases.

flymin commented 3 months ago

I suppose not, because there are no such cases during training.
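For reference, the geometric edit being asked about is easy to express, even though, as the reply says, the model is unlikely to render it. The sketch below computes the eight corners of a 3D box and flips it by rolling 180° about its forward axis. The box convention (x forward, y left, z up, size given as length, width, height), the yaw-plus-roll parametrization, and the function names are assumptions for illustration, not the repository's box format.

```python
import numpy as np

def rot_x(a):
    """Rotation about the x (forward) axis, i.e. roll."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_z(a):
    """Rotation about the z (up) axis, i.e. yaw."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def box_corners(center, size, yaw, roll=0.0):
    """Return the 8 corners (8x3) of a box. Roll is applied in the box frame, then yaw."""
    signs = np.array([[sx, sy, sz]
                      for sx in (1, -1) for sy in (1, -1) for sz in (1, -1)],
                     dtype=np.float64)
    local = signs * np.asarray(size, dtype=np.float64) / 2.0
    R = rot_z(yaw) @ rot_x(roll)
    return local @ R.T + np.asarray(center, dtype=np.float64)

center = np.array([10.0, 2.0, 1.0])
size = (4.5, 1.9, 1.6)   # hypothetical car: length, width, height in metres

upright = box_corners(center, size, yaw=0.3)
flipped = box_corners(center, size, yaw=0.3, roll=np.pi)

# Even-indexed rows are the roof corners (sz = +1 in the box frame).
roof_upright_z = upright[0::2, 2]
roof_flipped_z = flipped[0::2, 2]
```

After the flip, the roof corners sit below the box centre while the footprint and yaw are unchanged, which is the kind of orientation that does not occur in the training boxes.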

imbinwang commented 3 months ago

Got it. Thank you.