yitu-opensource / T2T-ViT

ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

Reproduce failure, test accuracy is 0 #15

Closed (valuefish closed this issue 3 years ago)

valuefish commented 3 years ago

I followed the instructions to train the network, but the test accuracy is 0.


Train command:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 ./distributed_train.sh 8 /home/datasets/imagenet2012 --model T2t_vit_t_14 -b 64 --lr 5e-4 --weight-decay .05 --img-size 224

Train args log file:

aa: rand-m9-mstd0.5-inc1
amp: true
apex_amp: false
aug_splits: 0
batch_size: 64
bn_eps: null
bn_momentum: null
bn_tf: false
channels_last: false
clip_grad: null
color_jitter: 0.4
cooldown_epochs: 10
crop_pct: null
cutmix: 1.0
cutmix_minmax: null
data: /home/datasets/imagenet2012
decay_epochs: 30
decay_rate: 0.1
dist_bn: ''
drop: 0.1
drop_block: null
drop_connect: null
drop_path: 0.1
epochs: 300
eval_checkpoint: ''
eval_metric: top1
gp: null
hflip: 0.5
img_size: 224
initial_checkpoint: ''
interpolation: ''
jsd: false
local_rank: 0
log_interval: 50
lr: 0.0005
lr_cycle_limit: 1
lr_cycle_mul: 1.0
lr_noise: null
lr_noise_pct: 0.67
lr_noise_std: 1.0
mean: null
min_lr: 1.0e-05
mixup: 0.2
mixup_mode: batch
mixup_off_epoch: 0
mixup_prob: 1.0
mixup_switch_prob: 0.5
model: T2t_vit_t_14
model_ema: true
model_ema_decay: 0.99996
model_ema_force_cpu: false
momentum: 0.9
native_amp: false
no_aug: false
no_prefetcher: false
no_resume_opt: false
num_classes: 1000
num_gpu: 1
opt: adamw
opt_betas: null
opt_eps: null
output: ''
patience_epochs: 10
pin_mem: false
pretrained: false
ratio:
- 0.75
- 1.3333333333333333
recount: 1
recovery_interval: 0
remode: pixel
reprob: 0.0
resplit: false
resume: ''
save_images: false
scale:
- 0.08
- 1.0
sched: cosine
seed: 42
smoothing: 0.1
split_bn: false
start_epoch: null
std: null
sync_bn: false
train_interpolation: random
tta: 0
use_multi_epochs_loader: false
validation_batch_size_multiplier: 1
vflip: 0.0
warmup_epochs: 3
warmup_lr: 1.0e-06
weight_decay: 0.05
workers: 8

Train log file:

epoch,train_loss,eval_loss,eval_top1,eval_top5
0,6.93210139641395,7.17947375,0.0,0.0
1,6.587943682303796,7.1799,0.0,0.0
2,6.158858546843896,7.23952125,0.0,0.0
3,5.799166578512925,7.31812625,0.0,0.0
4,5.478037082231962,7.295315,0.0,0.0
5,5.229772796997657,7.26689125,0.0,0.0
6,5.017756177828862,7.25214625,0.0,0.0
7,4.8663020592469435,7.29242125,0.0,0.0
8,4.718276950029226,7.33758875,0.0,0.0
9,4.618447037843557,7.43995125,0.0,0.0
10,4.531003245940576,7.5031975,0.0,0.0
11,4.415754194443043,7.5549925,0.0,0.0
12,4.344566400234516,7.64357,0.0,0.0
13,4.289363370491908,7.7813825,0.004,0.018
14,4.23766061434379,7.9322725,0.022,0.044
15,4.19633472424287,8.1038,0.044,0.082
16,4.12356809927867,8.2924725,0.062,0.106
17,4.112422979795015,8.47728,0.072,0.1299999999809265
18,4.05804208608774,8.6492625,0.08,0.16799999998092652
19,4.024759998688331,8.8037,0.088,0.1799999999809265
20,3.996352044435648,8.9579775,0.088,0.2119999999809265
21,3.97814240363928,9.0950575,0.09,0.25199999998092654
22,3.953244062570425,9.21374,0.086,0.27599999996185304
23,3.923392813939315,9.3194625,0.09,0.29399999996185305
24,3.8962308076711802,9.4055875,0.096,0.309999999961853
25,3.8826298530285177,9.4799275,0.094,0.321999999961853
26,3.857419889706832,9.5388525,0.096,0.317999999961853
27,3.8397301664719214,9.59234,0.098,0.33399999996185303
28,3.830301734117361,9.65348,0.1,0.33799999996185304
29,3.794567149419051,9.6977675,0.1,0.35199999996185305
30,3.7705691502644467,9.7288575,0.102,0.35799999996185305
31,3.7818640012007494,9.75144,0.102,0.35399999996185305
32,3.737819240643428,9.7711175,0.102,0.359999999961853
33,3.7500776923619785,9.7921825,0.098,0.359999999961853
34,3.7389543698384213,9.8204725,0.098,0.35799999996185305
35,3.7337539196014404,9.8351375,0.098,0.35199999996185305
36,3.71558638719412,9.8494025,0.098,0.34799999996185305
37,3.7017313562906704,9.8680675,0.096,0.35399999996185305
38,3.6933420575582065,9.887425,0.096,0.35599999996185305
39,3.685996890068054,9.90993,0.094,0.35199999996185305
40,3.6722696148432217,9.927635,0.092,0.34599999996185304
41,3.6679854576404276,9.9460075,0.092,0.33999999996185304
42,3.6521900250361514,9.964325,0.092,0.33199999996185303
43,3.6357782987447886,9.97667,0.092,0.325999999961853
44,3.6442664632430444,9.9876025,0.092,0.323999999961853
45,3.645428818005782,9.988935,0.092,0.32799999996185303
46,3.636759331593147,9.9957075,0.092,0.323999999961853
47,3.621595900792342,10.00688,0.092,0.321999999961853
48,3.6196213227051954,10.014905,0.09,0.317999999961853
49,3.6035012281858005,10.0267175,0.09,0.315999999961853
50,3.58397009739509,10.0337025,0.09,0.313999999961853
51,3.6020636650232167,10.038715,0.09,0.309999999961853
52,3.5740352272987366,10.0447675,0.09,0.305999999961853
53,3.5685019218004665,10.04946,0.09,0.303999999961853
54,3.5746707641161404,10.05026,0.09,0.301999999961853
55,3.5977427867742686,10.0567125,0.09,0.311999999961853
56,3.567492411686824,10.0615125,0.09,0.309999999961853
57,3.5403446509287906,10.061725,0.09,0.301999999961853
58,3.5217277132547817,10.0573,0.09,0.307999999961853
59,3.546560677198263,10.047595,0.09,0.311999999961853
60,3.5381715847895694,10.0411425,0.09,0.315999999961853
61,3.5360234746566186,10.0420225,0.09,0.321999999961853
62,3.540009801204388,10.03893,0.09,0.313999999961853
63,3.533318418722886,10.0283175,0.09,0.319999999961853
64,3.515296518802643,10.0103475,0.09,0.32999999996185303
65,3.5018028571055484,9.995815,0.09,0.33799999996185304
66,3.5014534088281484,9.98507,0.09,0.33999999996185304
67,3.5017168796979465,9.98187,0.09,0.33799999996185304
68,3.4972680440315833,9.97267,0.09,0.33599999996185304
69,3.49407067665687,9.97171,0.09,0.33399999996185303
70,3.5024705804311314,9.9653125,0.09,0.33799999996185304
71,3.487402943464426,9.95022,0.09,0.34599999996185304
72,3.4933561086654663,9.9368625,0.09,0.34799999996185305
73,3.470256122258993,9.92353,0.092,0.34799999996185305
74,3.4872275636746335,9.90852,0.092,0.35399999996185305
75,3.4731622613393345,9.90412,0.092,0.3479999999809265
76,3.4667081787036014,9.894335,0.092,0.3439999999809265
77,3.4779395231833825,9.8808425,0.092,0.3439999999809265
78,3.4699883506848264,9.87127,0.092,0.3319999999809265
79,3.449382346410018,9.857405,0.092,0.3319999999809265
80,3.469204169053298,9.8485525,0.092,0.3339999999809265
81,3.457207968601814,9.8324475,0.092,0.3359999999809265
82,3.4268045196166406,9.8140225,0.092,0.3299999999809265
83,3.4556516729868374,9.8081575,0.094,0.3359999999809265
84,3.4111033586355357,9.7960525,0.094,0.3359999999809265
85,3.4412546983131995,9.79248,0.094,0.33599999996185304
86,3.4455934212757993,9.7842675,0.094,0.3339999999809265
87,3.426803795190958,9.773575,0.094,0.3359999999809265
88,3.412861269253951,9.76771,0.094,0.3399999999809265
89,3.4229128681696377,9.7624575,0.094,0.3359999999809265
90,3.4238921403884888,9.7549125,0.094,0.3359999999809265
91,3.405551341863779,9.752085,0.094,0.3339999999809265
92,3.4058274076535153,9.750085,0.094,0.3379999999809265
93,3.411394944557777,9.7396325,0.092,0.3399999999809265
94,3.4062730303177466,9.7335,0.092,0.3419999999809265
95,3.371101186825679,9.7232875,0.092,0.3459999999809265
96,3.398221717430995,9.7187275,0.092,0.3399999999809265
97,3.3858281694925747,9.714435,0.09,0.3359999999809265
98,3.4034374402119565,9.7098225,0.09,0.3499999999809265
99,3.367879945498246,9.70577,0.09,0.3379999999809265
100,3.3765310690953183,9.70129,0.09,0.3339999999809265
101,3.3754292084620547,9.6929175,0.09,0.3299999999809265
102,3.3744607155139628,9.687665,0.09,0.3299999999809265
103,3.3805779218673706,9.685185,0.09,0.3319999999809265
104,3.379420335476215,9.6812125,0.09,0.3299999999809265
105,3.3533493280410767,9.67052,0.09,0.3299999999809265
106,3.3520270631863522,9.6659875,0.09,0.3319999999809265
107,3.356009464997512,9.6631875,0.09,0.3259999999809265
108,3.372372966546279,9.659135,0.09,0.3239999999809265
109,3.3472719421753516,9.6631075,0.09,0.3399999999809265
110,3.343109314258282,9.6603875,0.092,0.3419999999809265
111,3.3553348917227526,9.652095,0.092,0.3479999999809265
112,3.349419667170598,9.6462275,0.092,0.3439999999809265
113,3.3510518670082092,9.6429475,0.092,0.3439999999809265
114,3.3360960070903483,9.6365475,0.092,0.3459999999809265
115,3.3287810316452613,9.6298025,0.092,0.3419999999809265
116,3.338028444693639,9.6300675,0.092,0.3419999999809265
117,3.3271143344732432,9.6374,0.092,0.3439999999809265
118,3.313729322873629,9.63564,0.092,0.3479999999809265
119,3.330068060984978,9.632895,0.092,0.3519999999809265
120,3.3146992784280043,9.6276425,0.094,0.3579999999809265
121,3.323707924439357,9.62327,0.094,0.3499999999809265
122,3.3184149173589854,9.61767,0.094,0.3519999999809265
123,3.300702750682831,9.61535,0.094,0.3519999999809265
124,3.302518042234274,9.6146575,0.094,0.3459999999809265
125,3.310271075138679,9.61567,0.094,0.3359999999809265
126,3.300638061303359,9.61479,0.094,0.3379999999809265
127,3.297358696277325,9.6194825,0.094,0.3339999999809265
128,3.2771176329025855,9.6230825,0.092,0.3359999999809265
129,3.2591812839874854,9.6237225,0.092,0.3379999999809265
130,3.2713217322642985,9.624335,0.092,0.3459999999809265
131,3.2877807800586405,9.6238,0.092,0.3439999999809265
132,3.273715780331538,9.62516,0.092,0.3359999999809265
133,3.2607854329622707,9.6249725,0.092,0.3279999999809265
134,3.2733991971382728,9.62388,0.092,0.3259999999809265
135,3.2751850119003882,9.61748,0.092,0.3259999999809265
136,3.2595171332359314,9.612015,0.092,0.3259999999809265
137,3.2584590315818787,9.607375,0.092,0.3339999999809265
138,3.247004366838015,9.6043625,0.092,0.3319999999809265
139,3.2697606957875767,9.60127,0.092,0.3359999999809265
140,3.26054663841541,9.60151,0.092,0.3339999999809265
141,3.257813416994535,9.603375,0.092,0.3319999999809265
142,3.223457836187803,9.601535,0.092,0.3359999999809265
143,3.2234387397766113,9.6012425,0.092,0.3319999999809265
144,3.2414122361403246,9.5996425,0.092,0.3359999999809265
145,3.2428410740999074,9.59687,0.092,0.3319999999809265
146,3.223197739857894,9.59311,0.092,0.3299999999809265
147,3.2235814883158755,9.5896975,0.092,0.3319999999809265
148,3.244000948392428,9.586845,0.092,0.3379999999809265
149,3.2083670863738427,9.5836975,0.092,0.3339999999809265
150,3.1991160420271068,9.581405,0.092,0.3239999999809265
151,3.2165578328646145,9.581725,0.092,0.3279999999809265
152,3.1998400413073025,9.585645,0.092,0.3339999999809265
153,3.211859400455768,9.585325,0.094,0.3279999999809265
154,3.192562153706184,9.5832725,0.094,0.3439999999809265
155,3.1891639140936046,9.5863925,0.094,0.3439999999809265
156,3.197993576526642,9.5849,0.094,0.3399999999809265
157,3.186140225483821,9.588125,0.094,0.3419999999809265
158,3.1748920403994045,9.5853,0.094,0.3439999999809265
159,3.185867910201733,9.58202,0.094,0.3419999999809265
160,3.168504536151886,9.57818,0.094,0.3419999999809265
161,3.1859759459128747,9.57362,0.094,0.3439999999809265
162,3.1748691568007836,9.568075,0.096,0.3439999999809265
163,3.1551662591787486,9.566075,0.096,0.3439999999809265
164,3.1625287532806396,9.56317,0.094,0.3499999999809265
165,3.1615952207491946,9.56149,0.094,0.35599999996185305
166,3.1478181939858656,9.5531975,0.094,0.35199999996185305
167,3.1507090605222263,9.5494375,0.092,0.359999999961853
168,3.1515410038141103,9.5486375,0.092,0.361999999961853
169,3.1558457108644338,9.549385,0.094,0.359999999961853
170,3.1417548335515537,9.5474925,0.096,0.359999999961853
171,3.140062309228457,9.5471725,0.096,0.367999999961853
172,3.1301231338427615,9.54592,0.096,0.365999999961853
173,3.1230615010628333,9.53952,0.096,0.363999999961853
174,3.1334514159422655,9.5347475,0.096,0.363999999961853
175,3.1186371858303366,9.528215,0.094,0.373999999961853
176,3.134975951451522,9.523575,0.094,0.375999999961853
177,3.119438565694369,9.519575,0.094,0.377999999961853
178,3.107108019865476,9.5097625,0.096,0.381999999961853
179,3.1197330951690674,9.50155,0.096,0.377999999961853
180,3.0834620732527513,9.50027,0.096,0.377999999961853
181,3.1128392081994276,9.5044825,0.096,0.369999999961853
182,3.1128158752734842,9.5014425,0.096,0.363999999961853
183,3.096724904500521,9.5012025,0.096,0.365999999961853
184,3.085123387666849,9.5043225,0.096,0.35799999996185305
185,3.057559072971344,9.5053625,0.096,0.359999999961853
186,3.0812187103124766,9.5026425,0.096,0.359999999961853
187,3.0887141777918887,9.5026425,0.096,0.35799999996185305
188,3.0874110276882467,9.5005625,0.096,0.35799999996185305
189,3.0806986689567566,9.501095,0.094,0.359999999961853
190,3.076972397474142,9.503175,0.094,0.361999999961853
191,3.075471382874709,9.505015,0.094,0.35599999996185305
192,3.065813816510714,9.507575,0.094,0.361999999961853
193,3.046847811112037,9.501975,0.094,0.35799999996185305
194,3.052790430875925,9.4988825,0.094,0.359999999961853
195,3.0580251400287333,9.4916025,0.094,0.35799999996185305
196,3.025182361786182,9.48683,0.094,0.35799999996185305
197,3.046517706834353,9.4841375,0.094,0.35599999996185305
198,3.03582515166356,9.4863775,0.094,0.359999999961853
199,3.026519546141991,9.4845375,0.094,0.359999999961853
200,3.04235217662958,9.476325,0.094,0.361999999961853
201,3.0420833505116978,9.473685,0.094,0.35799999996185305
202,3.0199934244155884,9.474005,0.094,0.35199999996185305
203,3.0204014778137207,9.4687525,0.094,0.35199999996185305
204,3.0206079391332774,9.4649125,0.094,0.35199999996185305
205,3.0140575078817515,9.4622725,0.094,0.35399999996185305
206,2.9990465549322276,9.45806,0.094,0.361999999961853
207,2.9866109582094045,9.45406,0.094,0.361999999961853
208,3.0094408026108375,9.450755,0.094,0.35199999996185305
209,2.994456593806927,9.447235,0.094,0.35399999996185305
210,2.9664038786521325,9.445155,0.094,0.35399999996185305
211,2.990282173340137,9.444435,0.094,0.35799999996185305
212,2.9661563955820522,9.441715,0.094,0.35599999996185305
213,2.9725900338246274,9.4405425,0.094,0.365999999961853
214,2.969868710407844,9.4423825,0.094,0.367999999961853
215,2.981239465566782,9.446195,0.094,0.369999999961853
216,2.969034731388092,9.4487275,0.094,0.373999999961853
217,2.963566060249622,9.449795,0.094,0.36799999998092653
218,2.961514738889841,9.450675,0.094,0.3639999999809265
219,2.954156669286581,9.446835,0.094,0.36799999998092653
220,2.9516207438248854,9.448195,0.094,0.37399999998092653
221,2.934675565132728,9.4455275,0.094,0.379999999961853
222,2.9340533247360816,9.444435,0.094,0.377999999961853
223,2.939145422898806,9.442995,0.094,0.379999999961853
224,2.9211683181615977,9.440115,0.094,0.385999999961853
225,2.933304020991692,9.440195,0.094,0.39199999996185303
226,2.939405537568606,9.442195,0.094,0.383999999961853
227,2.9224969240335317,9.441955,0.094,0.3839999999809265
228,2.9054433153225827,9.4400875,0.094,0.3899999999809265
229,2.9118012144015384,9.4400875,0.094,0.3879999999809265
230,2.9079172840485206,9.439715,0.094,0.3859999999809265
231,2.907754673407628,9.436835,0.094,0.3919999999809265
232,2.9040455543077908,9.4315825,0.094,0.3959999999809265
233,2.9061117447339573,9.430435,0.094,0.3899999999809265
234,2.907056900171133,9.428515,0.094,0.3919999999809265
235,2.896571315251864,9.4279025,0.094,0.3919999999809265
236,2.8729285231003394,9.427475,0.094,0.3939999999809265
237,2.8715998301139245,9.42657,0.094,0.3959999999809265
238,2.885793530024015,9.42785,0.094,0.3939999999809265
239,2.8637887881352353,9.42777,0.094,0.3939999999809265
240,2.8815112572449904,9.42737,0.094,0.3859999999809265
241,2.862918679530804,9.42713,0.094,0.3879999999809265
242,2.859494145099933,9.42569,0.094,0.3879999999809265
243,2.85544406909209,9.421345,0.094,0.37799999998092654
244,2.852799190924718,9.418385,0.094,0.37399999998092653
245,2.850201345407046,9.414145,0.094,0.37399999998092653
246,2.853095673597776,9.4085725,0.094,0.37399999998092653
247,2.8446288659022403,9.4040125,0.094,0.3839999999809265
248,2.849750495873965,9.4003325,0.094,0.37399999998092653
249,2.8341719187222996,9.4026525,0.094,0.37599999998092654
250,2.8369316321152906,9.3991325,0.094,0.37599999998092654
251,2.8286971266453085,9.3974,0.094,0.37599999998092654
252,2.81843749834941,9.39332,0.094,0.36799999998092653
253,2.829844736135923,9.39268,0.094,0.36999999998092653
254,2.8351017511807957,9.39284,0.094,0.37199999998092653
255,2.8168522073672366,9.391295,0.094,0.36999999998092653
256,2.805282148031088,9.390895,0.094,0.36999999998092653
257,2.818808670227344,9.390815,0.094,0.37599999998092654
258,2.818044153543619,9.389935,0.094,0.37599999998092654
259,2.79145930822079,9.387935,0.094,0.36999999998092653
260,2.820599363400386,9.388655,0.094,0.3619999999809265
261,2.802613152907445,9.390735,0.094,0.3599999999809265
262,2.7996134803845334,9.3886275,0.094,0.3579999999809265
263,2.7997516164412866,9.3866275,0.094,0.3639999999809265
264,2.8026252480653615,9.3857475,0.094,0.3659999999809265
265,2.791010402716123,9.383615,0.094,0.3619999999809265
266,2.798398059148055,9.3822275,0.096,0.3559999999809265
267,2.770832135127141,9.378495,0.094,0.3619999999809265
268,2.7840563288101783,9.377935,0.094,0.3659999999809265
269,2.777487255059756,9.379455,0.094,0.3619999999809265
270,2.7976032495498657,9.379775,0.096,0.3619999999809265
271,2.800186336040497,9.377455,0.096,0.36999999998092653
272,2.762490983192737,9.37463,0.094,0.3659999999809265
273,2.7687534323105445,9.37127,0.094,0.3639999999809265
274,2.765368938446045,9.37327,0.094,0.3659999999809265
275,2.7600090228594265,9.3730825,0.094,0.36999999998092653
276,2.770041300700261,9.3715625,0.094,0.3639999999809265
277,2.7655663398595958,9.3714025,0.094,0.3639999999809265
278,2.7605401323391843,9.3692425,0.094,0.3639999999809265
279,2.767053810449747,9.36895,0.094,0.3579999999809265
280,2.7498939908467808,9.36879,0.094,0.3539999999809265
281,2.7527319605533895,9.36575,0.094,0.3579999999809265
282,2.751628706088433,9.36383,0.094,0.3579999999809265
283,2.7599301796693068,9.36119,0.094,0.3559999999809265
284,2.749599676865798,9.3618575,0.094,0.3519999999809265
285,2.7477452204777646,9.3615375,0.094,0.3519999999809265
286,2.7430056654489956,9.35999,0.094,0.3519999999809265
287,2.7681229389630833,9.3584975,0.094,0.3499999999809265
288,2.734010943999657,9.354765,0.094,0.3519999999809265
289,2.7406882597849918,9.3535375,0.094,0.3519999999809265
290,2.7367874888273387,9.350365,0.094,0.3579999999809265
291,2.7361806768637438,9.3460725,0.094,0.3539999999809265
292,2.7496441327608547,9.34146,0.094,0.3559999999809265
293,2.7410787160579977,9.33562,0.094,0.3559999999809265
294,2.732571922815763,9.3297275,0.094,0.3619999999809265
295,2.7549956028278055,9.3277275,0.096,0.3599999999809265
296,2.7410981013224673,9.3270875,0.096,0.3599999999809265
297,2.73639976978302,9.326635,0.096,0.3599999999809265
298,2.756814732001378,9.325275,0.096,0.3579999999809265
299,2.7238426758692813,9.324635,0.096,0.3619999999809265
300,2.7336285297687235,9.322475,0.096,0.3639999999809265
301,2.736770272254944,9.321115,0.096,0.3639999999809265
302,2.7286048852480373,9.3187425,0.096,0.3659999999809265
303,2.730357348918915,9.31645,0.096,0.37199999998092653
304,2.748807136829083,9.31309,0.094,0.37199999998092653
305,2.7522118871028605,9.3104775,0.094,0.37399999998092653
306,2.7455534751598654,9.3084775,0.094,0.37199999998092653
307,2.7293060834591207,9.3073575,0.094,0.37199999998092653
308,2.7273096350523143,9.3064775,0.094,0.36999999998092653
309,2.733037022443918,9.3070375,0.094,0.37399999998092653

yuanli2333 commented 3 years ago

Your training loss decreases normally but your validation loss does not, so I suspect the problem lies in your validation data, its dataloader, or its preprocessing. You should prepare your validation data as shown in the official PyTorch example for ImageNet, in this repo.
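
One way to check this diagnosis against the training log above is to compare the final eval metrics with chance level for 1000 classes: a uniform guess gives a cross-entropy of ln(1000) ≈ 6.91 and a top-1 accuracy of 0.1%. The sketch below is illustrative only. It assumes `eval_top1` is logged in percent (as timm-style training scripts usually do), and `diagnose` and `SAMPLE_LOG` are hypothetical names, with the two data rows copied from the first and last epochs of the log.

```python
import csv
import io
import math

# First and last rows of the reported log, with the original header.
SAMPLE_LOG = """epoch,train_loss,eval_loss,eval_top1,eval_top5
0,6.93210139641395,7.17947375,0.0,0.0
309,2.733037022443918,9.3070375,0.094,0.37399999998092653
"""

def diagnose(log_csv, num_classes=1000, tol=2.0):
    """Flag a run whose train loss falls while eval metrics stay at chance."""
    rows = list(csv.DictReader(io.StringIO(log_csv)))
    first, last = rows[0], rows[-1]

    chance_loss = math.log(num_classes)   # cross-entropy of a uniform guess
    chance_top1 = 100.0 / num_classes     # assumption: eval_top1 is in percent

    train_improved = float(last["train_loss"]) < float(first["train_loss"])
    eval_at_chance = float(last["eval_top1"]) <= tol * chance_top1
    eval_worse_than_uniform = float(last["eval_loss"]) > chance_loss

    return {
        "train_improved": train_improved,
        "eval_at_chance": eval_at_chance,
        "eval_worse_than_uniform": eval_worse_than_uniform,
        # All three together suggest mismatched validation labels rather
        # than a model that failed to learn.
        "suspect_val_pipeline": train_improved
        and eval_at_chance
        and eval_worse_than_uniform,
    }

report = diagnose(SAMPLE_LOG)
print(report)
```

An eval loss well above ln(1000) combined with chance-level accuracy means the model is confidently predicting classes that do not match the validation labels, which is consistent with a label or directory-layout problem on the validation side.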


valuefish commented 3 years ago

@yuanli2333 Thanks very much! Your assumption was correct: I was using the validation data without preprocessing. The issue is now solved.
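
For readers hitting the same problem: in the official PyTorch ImageNet example, this preprocessing step moves the flat validation images into one subfolder per class, so that a folder-based loader assigns the same labels as in training. The official example does this with a shell script; the Python sketch below only illustrates the idea. The function name, the file names, and the class IDs are hypothetical placeholders, and the real filename-to-class mapping has to come from the ImageNet validation ground truth.

```python
import shutil
from pathlib import Path

def sort_val_into_class_dirs(val_dir, filename_to_wnid):
    """Move each validation image into a subfolder named after its class ID.

    val_dir: directory holding the flat validation images.
    filename_to_wnid: dict mapping image file name -> class folder name
        (hypothetical input; build it from the validation ground truth).
    Returns the number of files moved.
    """
    val_dir = Path(val_dir)
    moved = 0
    for filename, wnid in filename_to_wnid.items():
        src = val_dir / filename
        if not src.is_file():
            continue  # already moved or missing: skip, so reruns are safe
        class_dir = val_dir / wnid
        class_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(class_dir / filename))
        moved += 1
    return moved
```

After this step the validation directory has the same layout as the training directory (one folder per class), which is what folder-based dataset loaders rely on to derive labels.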