sail-sg / volo

VOLO: Vision Outlooker for Visual Recognition
Apache License 2.0

Could you please release an ablation study that compares LV-ViT with and without outlook attention? #9

Closed theFoxofSky closed 3 years ago

theFoxofSky commented 3 years ago

This is great work that proposes a new form of attention. However, I would like to see how it performs when everything else is held under the same conditions.

Could you please release an ablation study that compares outlook attention and self-attention under the same training policy, hyperparameters (network width, depth), and architecture (for example ViT or LV-ViT)?

That way we could better judge the effectiveness of outlook attention.

yuanli2333 commented 3 years ago

Hi, a simple ablation study is to compare volo_d3↑448 (86M params, 67.9B FLOPs) with LV-ViT-L↑448 (150M params, 157.2B FLOPs). The top-1 accuracy of volo_d3↑448 is 86.3%, while that of LV-ViT-L↑448 is 86.2%. So with fewer parameters and FLOPs, VOLO achieves higher accuracy.
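
To make the size of that gap concrete, here is a minimal sketch that works out the relative savings from the numbers quoted above. The dictionary keys and the `relative_saving` helper are illustrative names chosen for this example; they are not part of the VOLO codebase.

```python
# Figures as quoted in the comment above (both models at 448x448 input).
# The key names are assumptions chosen for this sketch.
models = {
    "volo_d3": {"params_m": 86.0, "flops_b": 67.9, "top1": 86.3},
    "lv_vit_l": {"params_m": 150.0, "flops_b": 157.2, "top1": 86.2},
}


def relative_saving(smaller: float, larger: float) -> float:
    """Return the fraction by which `smaller` undercuts `larger`."""
    return 1.0 - smaller / larger


volo, lv_vit = models["volo_d3"], models["lv_vit_l"]

param_saving = relative_saving(volo["params_m"], lv_vit["params_m"])
flop_saving = relative_saving(volo["flops_b"], lv_vit["flops_b"])
acc_gap = volo["top1"] - lv_vit["top1"]  # difference in percentage points

print(f"parameter saving: {param_saving:.1%}")
print(f"FLOPs saving:     {flop_saving:.1%}")
print(f"top-1 gap:        {acc_gap:+.1f} points")
```

With these inputs, volo_d3 uses roughly 43% fewer parameters and 57% fewer FLOPs than LV-ViT-L, at a top-1 accuracy 0.1 points higher.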

houqb commented 3 years ago

@theFoxofSky You may refer to Table 5 in the paper.

theFoxofSky commented 3 years ago

> @theFoxofSky You may refer to Table 5 in the paper.

Thanks, I found it.

yuanli2333 commented 3 years ago

There is currently a small error in Table 5; we will update it soon.