cure-lab / MagicDrive

[ICLR24] Official implementation of the paper “MagicDrive: Street View Generation with Diverse 3D Geometry Control”
https://gaoruiyuan.com/magicdrive/
GNU Affero General Public License v3.0

A doubt about the effectiveness experiment. #87

Open KevenLee opened 2 months ago

KevenLee commented 2 months ago

I have a question about Table 6 in the v7 version of the MagicDrive paper. With the 2x training schedule, the cam-only results with and without synthetic data differ very little. Does this mean that with a 3x or 4x schedule, the benefit of synthetic data would be marginal (only a very small improvement in mAP and NDS)? If so, how should the effectiveness of synthetic data be evaluated?

flymin commented 2 months ago

There could be many reasons. For example, one of them could be limitations from the perception model’s architecture (but we are not sure).

As an evaluation, a comparison under controlled settings should always be convincing.

We have to admit that our exploration is very preliminary, but it shows the value of synthetic data. More detailed research on how to magnify the utility of synthetic data is welcome.

KevenLee commented 2 months ago

Thank you very much for your prompt response. I have been exploring methods to augment autonomous driving data, such as domain adaptation using image processing algorithms, and reconstruction and editing with NeRF and 3D Gaussian. I have been particularly intrigued by recent advances in large-model generation methods, such as the impressive MagicDrive. However, I have not yet found sufficient evidence that such data augmentation methods bring satisfactory value in practical applications. When the amount of real data (for example, growing from 1 million to 30 million) and the number of training epochs are increased further, how should we assess the benefits of synthesis methods? In other words, can synthesis methods be applied in real industrial settings?

flymin commented 2 months ago

The logic behind using generative models is similar to that of other synthesis/augmentation methods. The major difference lies in synthesis quality: generative models can achieve high quality at low cost. In addition, our controllable generation produces annotated data for downstream tasks. More features for corner-case generation can be found in our paper.

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 7 days with no activity. If you do not have any follow-ups, the issue will be closed soon.