mrabiabrn opened this issue 1 month ago
Yes, you can follow the same procedure as for the validation set. They are similar.
Hi, I tried to generate new samples from the training data using the provided validation-set generation script. However, I noticed that for training instances the generations are not diverse and are quite similar to the original data (vehicle colors, road shape, background, etc.). This is not the case for the validation samples, where I can see diverse generations for the same bounding boxes. I added examples from the training and validation generation results below. What do you think could be the reason for this?
Validation original vs generated
Training original vs generated
In some cases this may happen. However, using such data to augment the original training set still leads to improvements in downstream tasks.
If the effect is severe in your case, you can try editing the scene condition to generate more diverse data for augmentation.
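One lightweight way to act on this suggestion is to perturb the text/scene condition before each sampling call, so repeated generations of the same boxes differ in appearance. A minimal sketch; the prompt template and attribute pools below are illustrative assumptions, not the repository's actual conditioning format:

```python
import random

# Illustrative attribute pools; the repository's scene condition may
# encode weather / time of day differently.
WEATHERS = ["sunny", "rainy", "cloudy", "foggy"]
TIMES = ["daytime", "night", "dusk"]

def perturb_prompt(base_prompt: str, rng: random.Random) -> str:
    """Append randomly sampled scene attributes to a base caption so
    that generations of the same bounding boxes vary in appearance."""
    weather = rng.choice(WEATHERS)
    time_of_day = rng.choice(TIMES)
    return f"{base_prompt}, {weather}, {time_of_day}"

rng = random.Random(0)
prompts = [perturb_prompt("a driving scene with cars", rng) for _ in range(3)]
```

Each augmented sample would then be generated with its own perturbed prompt instead of the original scene description.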
This issue is stale because it has been open for 7 days with no activity. If you do not have any follow-ups, the issue will be closed soon.
Training generations are generally like this in my case. Augmenting the training data with these samples doesn't improve CVT performance; it even hurts it. I can try editing the scene and text conditions, but to reproduce your results it would be great if you could share your training- and validation-set generations so I can identify any discrepancies.
We have already released the model weights. You can sample from our model and see for yourself.
I cannot share the data. However, I have to admit that our results on the training set are similar to yours. We did not modify the code for the perception models; we only added more generated data, as described in our paper. Maybe you can also try BEVFusion and see.
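For reference, "added more generated data" amounts to concatenating the generated samples with the original training set before training the perception model. A minimal sketch with placeholder records; the dicts and ratio parameter below are illustrative, not the actual CVT/BEVFusion data pipeline:

```python
# Placeholder records standing in for training samples; a real
# pipeline would load camera images and annotations, not dicts.
original = [{"id": i, "source": "real"} for i in range(4)]
generated = [{"id": i, "source": "synthetic"} for i in range(4)]

def build_augmented_set(real, synthetic, synth_ratio=1.0):
    """Concatenate the real training set with a fraction of the
    generated samples (synth_ratio=1.0 uses all generated data)."""
    n_synth = int(len(synthetic) * synth_ratio)
    return real + synthetic[:n_synth]

train_set = build_augmented_set(original, generated, synth_ratio=0.5)
```

With a framework like PyTorch, the same idea is usually expressed with `torch.utils.data.ConcatDataset` over the real and synthetic datasets.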
@flymin Hello author, the road segmentation performance of CVT in Table 1 is 61, which is confirmed to be 59.3 in Table 4. We also reproduced 59.x. How was 61 obtained? Or what is the difference between these two numbers?
> which is confirmed to be 59.3 in Table 4
This is not true. Please also see Figure 7. I think the problem lies in $M = \mathbf{0}$.
To confirm, the results in Table 1 were generated with `use_zero_map_as_unconditional = True` and a guidance scale of 2. Is this correct?
Yes. And sorry for the late reply.
No problem at all, and thanks for clarifying!
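For anyone else reproducing this: `use_zero_map_as_unconditional = True` means the unconditional branch of classifier-free guidance receives an all-zero map condition, and the two branches are combined with guidance scale 2. A NumPy sketch of that combination; `denoise` below is a stand-in toy function, not the repository's diffusion model:

```python
import numpy as np

def denoise(x, map_cond):
    """Stand-in for the model's noise prediction; in reality this is
    a conditional UNet forward pass on the noisy latent."""
    return x * 0.1 + map_cond.mean()

def cfg_predict(x, map_cond, guidance_scale=2.0):
    """Classifier-free guidance where the unconditional branch uses a
    zero map (i.e. use_zero_map_as_unconditional=True)."""
    eps_cond = denoise(x, map_cond)
    eps_uncond = denoise(x, np.zeros_like(map_cond))
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

x = np.ones((2, 2))
m = np.full((2, 2), 0.5)
eps = cfg_predict(x, m, guidance_scale=2.0)
```

A scale of 1 recovers the purely conditional prediction; larger scales push the output further toward the map-conditioned branch, which is why the choice of scale affects the reported segmentation numbers.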
Hi,
I noticed the code for validation-set generation but didn't find any code for training-data augmentation. Should we follow the same procedure to generate training data? Could you provide more details, or share a link to the generated data for BEV perception? That would be greatly appreciated.