PointsCoder / ONCE_Benchmark

One Million Scenes for Autonomous Driving
177 stars 32 forks

semi-supervised learning with gtaug? #3

Closed AndyYuan96 closed 3 years ago

AndyYuan96 commented 3 years ago

Hi, after reading the semi-supervised config files, I noticed that for all the semi-supervised methods you didn't use the gt-aug augmentation on labeled data. Did you already run experiments and find that gt-aug gives no improvement in semi-supervised learning, or was it just for convenience?

bmankirLinker commented 3 years ago

@AndyYuan96 do you mean no augmentation is applied to the data fed to the teacher? Because the origin/student model looks like it gets augmentation on labeled data.

AndyYuan96 commented 3 years ago

> @AndyYuan96 do you mean no augmentation is applied to the data fed to the teacher? Because the origin/student model looks like it gets augmentation on labeled data.

Just no gt-aug; they do apply other augmentations like rotation. When the labeled data is small, gt-aug usually gives a relatively large improvement compared with training without it.

bmankirLinker commented 3 years ago

@AndyYuan96 there's also a hint about what you're asking in Section S3.2, GT Sampling: https://arxiv.org/pdf/2103.05346.pdf

> We do not adopt the GT sampling data augmentation for all settings for fair comparisons. The reason is that it is unaffordable for the iterative self-training pipeline to use GT sampling data augmentation since it requires frequently generating a new GT database with updated pseudo labels, which produces a large computation cost (leveraging GT sampling for self-training takes more than 3× training time).
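For context, GT sampling ("gt-aug") crops labeled objects into an offline database and pastes a few of them into each training scene; the cost the paper mentions comes from rebuilding that database every time the pseudo labels change. A minimal sketch of the paste step (function and parameter names are my own, not from this repo):

```python
import numpy as np

def gt_sample(scene_points, gt_db, num_paste=2, rng=None):
    """Paste object point clouds from a GT database into a scene.

    scene_points: (N, 3) array of scene points.
    gt_db: list of (M_i, 3) arrays, each the points of one cropped GT object.
    Returns the augmented scene and the indices of the pasted objects.
    """
    rng = np.random.default_rng(rng)
    # Pick a few distinct objects from the database at random.
    idx = rng.choice(len(gt_db), size=min(num_paste, len(gt_db)), replace=False)
    pasted = [gt_db[i] for i in idx]
    # A real implementation would also transform boxes, remove occluded
    # scene points, and check for collisions before concatenating.
    return np.concatenate([scene_points] + pasted, axis=0), idx
```

In self-training, `gt_db` would have to be regenerated from the updated pseudo labels at every round, which is what makes it ~3x slower per the paper.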

AndyYuan96 commented 3 years ago

> @AndyYuan96 there's also a hint about what you're asking in Section S3.2, GT Sampling: https://arxiv.org/pdf/2103.05346.pdf
>
> We do not adopt the GT sampling data augmentation for all settings for fair comparisons. The reason is that it is unaffordable for the iterative self-training pipeline to use GT sampling data augmentation since it requires frequently generating a new GT database with updated pseudo labels, which produces a large computation cost (leveraging GT sampling for self-training takes more than 3× training time).

I mean generating the gt-aug database using only labeled data.

bmankirLinker commented 3 years ago

> @AndyYuan96 there's also a hint about what you're asking in Section S3.2, GT Sampling: https://arxiv.org/pdf/2103.05346.pdf
>
> We do not adopt the GT sampling data augmentation for all settings for fair comparisons. The reason is that it is unaffordable for the iterative self-training pipeline to use GT sampling data augmentation since it requires frequently generating a new GT database with updated pseudo labels, which produces a large computation cost (leveraging GT sampling for self-training takes more than 3× training time).
>
> I mean generating the gt-aug database using only labeled data.

It looks like gt-aug is applied only to the origin/pre-trained model.