Closed · wforange closed this 7 months ago

Hi, thank you for your great work! I have 2 questions about RepVit_0_9.
Thanks for your interest!
Thank you for your answer. I didn't modify the batch size; I used the default setting of 256. Could you answer the question about the accuracy?
Why did you change the lr to 6e-3? What is the accuracy if you keep the lr at the default value?
With the default setting I got the log below, and acc1 increases too slowly. Comparing it to the log you provided, I guessed the lr should be 6e-3, but I still can't reproduce your result.
{"train_lr": 1.000000000000068e-06, "train_loss": 6.975921413690733, "test_loss": 6.928075437327378, "test_acc1": 0.14400001098632811, "test_acc5": 0.5980000273132324, "epoch": 0, "n_parameters": 5489328} {"train_lr": 1.000000000000068e-06, "train_loss": 6.962617924649843, "test_loss": 6.91879555287252, "test_acc1": 0.1540000099182129, "test_acc5": 0.7040000312805176, "epoch": 1, "n_parameters": 5489328} {"train_lr": 0.00010080000000000871, "train_loss": 6.845906076576117, "test_loss": 6.599413409487892, "test_acc1": 2.626000080413818, "test_acc5": 9.092000283966064, "epoch": 2, "n_parameters": 5489328} {"train_lr": 0.00020059999999999599, "train_loss": 6.619759896676317, "test_loss": 5.937542263788122, "test_acc1": 7.484000213470459, "test_acc5": 21.026000695800782, "epoch": 3, "n_parameters": 5489328} {"train_lr": 0.0003003999999999828, "train_loss": 6.369503548677019, "test_loss": 5.4622787956063075, "test_acc1": 12.124000384216309, "test_acc5": 30.72400082397461, "epoch": 4, "n_parameters": 5489328} {"train_lr": 0.00040020000000002843, "train_loss": 6.231803081971374, "test_loss": 5.275543744327458, "test_acc1": 13.990000392456055, "test_acc5": 34.42800095703125, "epoch": 5, "n_parameters": 5489328} {"train_lr": 0.0004996642360148386, "train_loss": 6.2141018290218595, "test_loss": 5.2779140254013415, "test_acc1": 14.554000370788573, "test_acc5": 34.336000876464844, "epoch": 6, "n_parameters": 5489328} {"train_lr": 0.0004995165484649184, "train_loss": 6.166435195006532, "test_loss": 5.2440667662001745, "test_acc1": 15.882000408935546, "test_acc5": 35.872000935058594, "epoch": 7, "n_parameters": 5489328} {"train_lr": 0.0004993420469200044, "train_loss": 6.121721277324607, "test_loss": 5.166219835062973, "test_acc1": 17.77600058227539, "test_acc5": 38.030001118164066, "epoch": 8, "n_parameters": 5489328} {"train_lr": 0.0004991407505161498, "train_loss": 6.071527244566346, "test_loss": 5.061290801026439, "test_acc1": 19.700000533447266, "test_acc5": 39.95200100585937, "epoch": 9, "n_parameters": 5489328} {"train_lr": 0.0004989126813277368, "train_loss": 5.998429657076951, "test_loss": 4.927277659641877, "test_acc1": 21.396000593261718, "test_acc5": 41.930001088867186, "epoch": 10, "n_parameters": 5489328} {"train_lr": 0.0004986578643652291, "train_loss": 5.921012676686501, "test_loss": 4.805169545967161, "test_acc1": 22.998000694580078, "test_acc5": 43.72200132080078, "epoch": 11, "n_parameters": 5489328} {"train_lr": 0.0004983763275721029, "train_loss": 5.853875551006491, "test_loss": 4.682203596784868, "test_acc1": 24.362000673217775, "test_acc5": 45.34600124023437, "epoch": 12, "n_parameters": 5489328} {"train_lr": 0.0004980681018220224, "train_loss": 5.783869908391524, "test_loss": 4.5477353980523025, "test_acc1": 26.17600067504883, "test_acc5": 47.71600116210937, "epoch": 13, "n_parameters": 5489328} {"train_lr": 0.0004977332209154644, "train_loss": 5.721688020238869, "test_loss": 4.486863227290962, "test_acc1": 27.65400068847656, "test_acc5": 48.98200126953125, "epoch": 14, "n_parameters": 5489328} {"train_lr": 0.0004973717215759342, "train_loss": 5.658488622672266, "test_loss": 4.34144630595928, "test_acc1": 28.79400069213867, "test_acc5": 51.46200160644531, "epoch": 15, "n_parameters": 5489328} {"train_lr": 0.0004969836434458476, "train_loss": 5.614673731805419, "test_loss": 4.251058745930213, "test_acc1": 30.60800080810547, "test_acc5": 52.408001640625, "epoch": 16, "n_parameters": 5489328} {"train_lr": 0.0004965690290822709, "train_loss": 5.553136842904522, "test_loss": 
4.1680190090004725, "test_acc1": 31.874000888671876, "test_acc5": 54.086001625976564, "epoch": 17, "n_parameters": 5489328} {"train_lr": 0.0004961279239524144, "train_loss": 5.5050668902248505, "test_loss": 4.131608025718282, "test_acc1": 32.57400111328125, "test_acc5": 54.0420015234375, "epoch": 18, "n_parameters": 5489328} {"train_lr": 0.0004956603764285549, "train_loss": 5.464044562441935, "test_loss": 3.9814893926372963, "test_acc1": 34.338000909423826, "test_acc5": 56.23800159179687, "epoch": 19, "n_parameters": 5489328} {"train_lr": 0.0004951664377823055, "train_loss": 5.417946175991489, "test_loss": 3.9336773275419046, "test_acc1": 34.86800102783203, "test_acc5": 57.3200017578125, "epoch": 20, "n_parameters": 5489328} {"train_lr": 0.0004946461621798025, "train_loss": 5.378094382518582, "test_loss": 3.869185467712752, "test_acc1": 36.356000913085936, "test_acc5": 58.55800174804688, "epoch": 21, "n_parameters": 5489328}
Did you only train the model for 22 and 18 epochs, respectively?
Yes, the training logs are generated epoch by epoch. Training takes 1.5 hours per epoch in my environment, so I only trained for a few epochs.
So the lower accuracy is expected, given that fewer epochs were trained.
No, I mean that if I compare the accuracy at each epoch, I should get the same result, right? For example, at epoch 15 I already see a gap, and I don't think longer training can compensate for it.
mine: {"train_lr": 0.0029639955590981108, "train_loss": 3.918420416702755, "test_loss": 2.0485606557540312, "test_acc1": 53.86400150390625, "test_acc5": 78.40400258789063, "epoch": 15, "n_parameters": 5489328}
yours: {"train_lr": 0.002983962137779637, "train_loss": 3.6223566518556014, "test_loss": 1.7595897195014087, "test_acc1": 59.484000420837404, "test_acc5": 82.38600054382324, "epoch": 15, "n_parameters": 5489328}
Thanks. The reason may lie in the smaller effective batch size; we suggest reproducing the results with the default configuration.
Thank you for your answer. So you think that 8 machines with an 8*256 effective batch size speed up the convergence of accuracy?
Yes.
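To make the intuition concrete, a common heuristic is to scale the learning rate linearly with the effective batch size (Goyal et al., 2017). The sketch below is only illustrative: the base lr and reference batch size are assumptions, not necessarily the exact rule used in this repo.

```python
def scaled_lr(base_lr: float, total_batch: int, ref_batch: int = 512) -> float:
    """Linear lr-scaling heuristic: lr grows proportionally with the effective batch size."""
    return base_lr * total_batch / ref_batch

# Single machine, default config: batch size 256.
print(scaled_lr(1e-3, 256))      # 5e-4, which matches the peak lr in the log above
# 8 machines x 256 images each: effective batch size 2048.
print(scaled_lr(1e-3, 8 * 256))  # 4e-3, the same order as the 6e-3 actually used
```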
Got it, thank you.