TencentYoutuResearch / Classification-SemiCLS

Code for CVPR 2022 paper “Class-Aware Contrastive Semi-Supervised Learning”

Cannot reproduce STL10 results #10

Closed aeo123 closed 1 year ago

aeo123 commented 2 years ago

I'm having an issue reproducing the CoMatch STL10 results: mine are much better than 80%. Here is my training log with the fold1 labeled indices. Could you please share your log or your 5-fold results? 2022-09-27 10:50:52-1.log
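For reference, a minimal sketch for inspecting which 1k labeled samples a fold selects, using torchvision's predefined STL-10 folds (assuming they match the fold file this repo uses; whether "fold1" means fold index 0 or 1 is also an assumption):

# Sketch only: dump the labeled split that a given STL-10 fold selects,
# so two runs can confirm they train on the same 1k labeled images.
from torchvision.datasets import STL10

labeled = STL10(root="./data", split="train", folds=1, download=True)
print(len(labeled))          # each predefined fold has 1000 labeled images
print(labeled.labels[:10])   # first few labels of this fold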

KaiWU5 commented 1 year ago

Sorry, I cannot share my full log because of some privacy issues, but below are the last lines of it. Other than refactoring my code, I changed nearly nothing. It's an interesting phenomenon, also mentioned in #6; I will check and reply later.

10/25/2021 20:15:23 - INFO - root - Epoch 511 top-1 acc: 80.09
10/25/2021 20:15:23 - INFO - root - Epoch 511 top-5 acc: 97.89
10/25/2021 20:15:26 - INFO - root - Best top-1 acc: 80.80
10/25/2021 20:15:26 - INFO - root - Mean top-1 acc: 80.05
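For anyone comparing runs, a small sketch that pulls the per-epoch top-1 accuracy out of a training log, assuming the exact "Epoch N top-1 acc: X" line format shown above ("train.log" below is a placeholder path):

import re

# Matches lines like "... - INFO - root - Epoch 511 top-1 acc: 80.09"
PATTERN = re.compile(r"Epoch (\d+) top-1 acc: ([\d.]+)")

def top1_curve(log_path):
    """Return a list of (epoch, top-1 acc) pairs parsed from a training log."""
    curve = []
    with open(log_path) as f:
        for line in f:
            m = PATTERN.search(line)
            if m:
                curve.append((int(m.group(1)), float(m.group(2))))
    return curve

curve = top1_curve("train.log")                      # placeholder path
best_epoch, best_acc = max(curve, key=lambda x: x[1])
print("best top-1: %.2f at epoch %d" % (best_acc, best_epoch))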

aeo123 commented 1 year ago

I rechecked the config in my training log and made sure it's exactly the same as the published config in the repo. Also, the labeled set loads only 1k samples as expected, and the resnet18 model is not pretrained. It's hard to figure out why the accuracy increases so much :(
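One quick way to double-check this (assuming the training run dumps a copy of its config; the local path below is hypothetical) is to diff the two config files directly:

# Sketch: confirm a local training config is identical to the one in the repo.
import difflib

repo_cfg = "configs/comatch/comatch_stl10_wres_r18_b1x64_l5.py"
local_cfg = "my_run/config.py"   # hypothetical path to the config saved at train time

with open(repo_cfg) as a, open(local_cfg) as b:
    diff = list(difflib.unified_diff(a.readlines(), b.readlines(),
                                     fromfile=repo_cfg, tofile=local_cfg))
print("".join(diff) or "configs are identical")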

KaiWU5 commented 1 year ago

Sorry for the trouble. I also found my CoMatch runs for seeds 1-5: 79.97, 80.22, 79.64, 80.58, 79.22 (mean 79.9±0.68, slightly better than the original paper but with a bigger variance, so we report the original paper's 79.80±0.38 instead).
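For completeness, a quick sketch (plain NumPy, nothing repo-specific) to aggregate the per-seed numbers above; the spread depends on whether the sample or population standard deviation is used, so it may not match the ±0.68 above exactly:

import numpy as np

# Per-seed best top-1 accuracies quoted above (seeds 1-5).
seeds = [79.97, 80.22, 79.64, 80.58, 79.22]
print("mean: %.2f" % np.mean(seeds))
print("std (sample, ddof=1): %.2f" % np.std(seeds, ddof=1))
print("std (population, ddof=0): %.2f" % np.std(seeds, ddof=0))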

I will check the changes in the refactoring process and post the results later.

aeo123 commented 1 year ago

Thanks a lot! This really helps with my work.

syorami commented 1 year ago

Is there any progress on this issue? @aeo123 @KaiWU5

KaiWU5 commented 1 year ago

#6 #4

I have run several tests but didn't get results much better than 80% (in @aeo123's log, epoch 40 already reached top-1 acc > 84.4). Please check that we are using the same training commands and configs.

Config: configs/comatch/comatch_stl10_wres_r18_b1x64_l5.py

GPUs: 1 (single GPU, following the CoMatch paper)

Training commands:
Command1: python3 train_semi.py --cfg configs/comatch/comatch_stl10_wres_r18_b1x64_l5.py --out YOUR/output/path --seed YOURSeed --gpu-id 0
Command2: python3 -m torch.distributed.launch --nproc_per_node 1 train_semi.py --cfg configs/comatch/comatch_stl10_wres_r18_b1x64_l5.py --out /YOUR/output/path --use_BN True --seed YOURSeed

Data: stl10_binary.tar.gz, downloaded from the official website, 2640397119 bytes, md5sum 91f7769df0f17e558f3565bffb0c7dfb
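To rule out a corrupted download, a minimal check (standard library only; the path below is assumed to point at the downloaded archive) against the size and md5 above:

import hashlib, os

path = "stl10_binary.tar.gz"
md5 = hashlib.md5()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):   # read in 1 MiB chunks
        md5.update(chunk)
print("bytes:", os.path.getsize(path))   # expect 2640397119
print("md5:  ", md5.hexdigest())         # expect 91f7769df0f17e558f3565bffb0c7dfb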

My results:
Command1 & Seed1: in training, current epoch 147, best top-1: 80.65
Command2 & Seed5: in training, current epoch 262, best top-1: 80.20
Command2 & Seed1: in training, current epoch 98, best top-1: 80.01
Command2 & Seed5 & 2 GPUs: in training, current epoch 117, best top-1: 77.97

So my guesses are as above: maybe we are using different training commands, configs, numbers of GPUs, or data (the worst case). Please let me know if everything on your side is the same and you still get different results.