Open KevenLee opened 2 months ago
There could be many reasons. One possibility is a limitation of the perception model’s architecture, but we are not sure.
For evaluation purposes, comparisons under controlled settings should always be convincing.
We have to admit that our exploration is very preliminary, but it shows the value of synthetic data. More detailed research on how to magnify the utility of synthetic data is welcome.
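As a rough illustration of the controlled comparison mentioned above, here is a minimal Python sketch that reports the gain from synthetic data separately for each training schedule, so that runs are only compared when everything except the use of synthetic data is the same. The `Run` class, the `synthetic_gain` helper, and all numbers are hypothetical placeholders for illustration; none of them come from the paper or the MagicDrive codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    schedule: str        # training schedule label, e.g. "1x" or "2x"
    use_synthetic: bool  # whether synthetic data was added to training
    map_score: float     # detection mAP in percent


def synthetic_gain(runs):
    """Return {schedule: mAP with synthetic - mAP real only}.

    Only schedules that have both a real-only run and a
    real-plus-synthetic run are included.
    """
    by_key = {(r.schedule, r.use_synthetic): r.map_score for r in runs}
    schedules = sorted({r.schedule for r in runs})
    return {
        s: by_key[(s, True)] - by_key[(s, False)]
        for s in schedules
        if (s, True) in by_key and (s, False) in by_key
    }


# Placeholder values for illustration only; NOT results from the paper.
runs = [
    Run("1x", False, 30.0),
    Run("1x", True, 32.0),
    Run("2x", False, 33.0),
    Run("2x", True, 33.5),
]

gains = synthetic_gain(runs)
for schedule, gain in gains.items():
    print(f"{schedule}: {gain:+.1f} mAP from synthetic data")
```

Plotting this per-schedule gain against the schedule length (or the amount of real data) is one way to see whether the benefit shrinks as training is extended. In practice each entry would ideally be averaged over several seeds.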
Thank you very much for your prompt response. I have been exploring methods to augment autonomous driving data, such as domain adaptation using image processing algorithms and reconstruction and editing with NeRF or 3D Gaussians. I have also been particularly intrigued by recent advances in generation with large models, such as the impressive MagicDrive. However, I have not yet found sufficient evidence that such data augmentation methods bring satisfactory value to practical applications. When the amount of real data is further increased (for example, from 1 million to 30 million) along with the number of training epochs, how should we assess the benefits of synthesis methods? In other words, can synthesis methods be applied in actual industrial settings?
The logic behind using generation models is similar to that of other synthesis/augmentation methods. The major difference lies in synthetic quality: generation models can achieve high quality at low cost. Besides, our controllable generation produces annotated data for downstream tasks. More features for corner case generation can be found in our paper.
This issue is stale because it has been open for 7 days with no activity. If you do not have any follow-ups, the issue will be closed soon.
I have a question regarding Table 6 in the v7 version of the MagicDrive paper. When the 2x training schedule is adopted, the difference in the cam-only results between using and not using synthetic data is very small. Does this mean that with a 3x or 4x schedule, the benefit of using synthetic data would not be very large (the improvement in mAP and NDS is very small)? In that case, how should the effectiveness of synthetic data be evaluated?