cvtower / SeesawNet_pytorch

PyTorch (0.4.1/1.0 verified) code and pre-trained models for the paper: Seesaw-Net: Convolution Neural Network With Uneven Group Convolution.
https://arxiv.org/abs/1905.03672

issue of cifartest #2

Open thinkInJava33 opened 4 years ago

thinkInJava33 commented 4 years ago

There are some problems when I train both IGCV3 and your SeesawNet using your model code and checkpoints.

IGCV3: I was able to resume from the checkpoint you provided, but the result is totally wrong. Did you change some training settings on CIFAR-100?

SeesawNet: I cannot train your SeesawNet on CIFAR-100 (RuntimeError: Given input size: (1280x2x2). Calculated output size: (1280x0x0). Output size is too small). I think the code you provided may not be the correct version.

cvtower commented 4 years ago

Hi @thinkInJava33, the model configs for the ImageNet and CIFAR datasets are different; I got the correct CIFAR config from the authors of IGCV3. I uploaded the model file and pre-trained model for CIFAR just yesterday, and I will upload the training code for CIFAR later if needed.
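
For reference, that RuntimeError is the typical symptom of running an ImageNet config on 32x32 inputs: the feature map shrinks to 2x2 before the final pooling layer, which expects the 7x7 map produced by 224x224 inputs. A minimal illustration (the 7x7 pooling kernel size is an assumption, not taken from the repo):

    import torch
    import torch.nn as nn

    # Illustration only: a 2x2 feature map fed into a pooling layer sized
    # for ImageNet inputs (kernel 7 assumed) reproduces the reported error.
    features = torch.randn(1, 1280, 2, 2)
    pool = nn.AvgPool2d(kernel_size=7)
    pool(features)
    # RuntimeError: Given input size: (1280x2x2).
    # Calculated output size: (1280x0x0). Output size is too small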

thinkInJava33 commented 4 years ago

@cvtower That would help a lot. I trained the IGCV3 model using the code you provided and still cannot reproduce the result on CIFAR-100; I only get an accuracy under 77%.

cvtower commented 4 years ago

> @cvtower That would help a lot. I trained the IGCV3 model using the code you provided and still cannot reproduce the result on CIFAR-100; I only get an accuracy under 77%.

Hi @thinkInJava33,

Sorry for the delayed reply. I just downloaded the source version from https://github.com/xxradon/IGCV3-pytorch and compared it with my implementation, and I found some differences. My goal is to reproduce the IGCV3 result from the original paper with the fewest modifications, so I only changed the following in @xxradon's code:

  1. line 22 in cifar100data.py:

    transforms.ColorJitter(0.3, 0.3, 0.3),

    It seems that @xxradon added color jitter to CIFAR-100 during training, but I think this should not be enabled since the authors did not mention it in the paper, so I removed this operation (see the transform sketch after this list).

  2. line 57 in igcv3.py (see the projection sketch after this list):

    nn.ReLU6(inplace=True),#remove according to paper
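
A minimal sketch of the CIFAR-100 training transform with the color jitter removed (the remaining transforms and normalization statistics are assumptions for illustration, not copied from cifar100data.py):

    import torchvision.transforms as transforms

    # Training-time augmentation without ColorJitter; the crop/flip and the
    # normalization statistics below are illustrative assumptions.
    train_transform = transforms.Compose([
        transforms.RandomCrop(32, padding=4),
        transforms.RandomHorizontalFlip(),
        # transforms.ColorJitter(0.3, 0.3, 0.3),  # removed: not mentioned in the paper
        transforms.ToTensor(),
        transforms.Normalize((0.5071, 0.4865, 0.4409),
                             (0.2673, 0.2564, 0.2762)),
    ])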
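
A minimal sketch of the linear-bottleneck style this removal restores: the 1x1 projection ends with batch norm only, with no activation after it (the helper name and exact layer layout are assumptions, not copied from igcv3.py):

    import torch.nn as nn

    # Sketch of a linear-bottleneck projection: 1x1 conv + BN with no
    # activation afterwards; the trailing ReLU6 is the line that gets removed.
    def linear_projection(in_channels, out_channels):
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
            # nn.ReLU6(inplace=True),  # removed according to the paper
        )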

The model-related differences below reflect the correct configuration for the CIFAR dataset that I got from the authors of IGCV3:

  1. line 85 in igcv3.py: elif args.downsampling == 4: (not 2 as in @xxradon's repo)
  2. line 105 in igcv3.py:

    # setting of inverted residual blocks

    self.interverted_residual_setting = [
        # t, c, n, s
        [1, 16, 1, 1],
        [6, 24, 4, s2],
        [6, 32, 6, 2],
        [6, 64, 8, 2],
        [6, 96, 6, 1],
        [6, 160, 6, 1],
        [6, 320, 1, 1],
    ]

    and the corresponding setting in @xxradon's repo is wrong:

    # setting of inverted residual blocks

    self.interverted_residual_setting = [
        # t, c, n, s
        [1, 16, 1, 1],
        [6, 24, 4, s2],
        [6, 32, 6, 2],
        [6, 64, 8, 2],
        [6, 96, 6, 1],
        [6, 160, 6, 2],
        [6, 320, 1, 1],
    ]

    I guess you will be able to reproduce the result from the original paper with these changes (a quick sanity check of the two stage settings follows below). If I remember correctly, I ran into this issue right when @xxradon posted the repo, and I made the modifications according to the original paper and the authors. I will upload the whole training code if needed, but I have renamed all the files and models in my private version since I only use IGCV3 as a reference model.
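
A quick sanity check of how the two stage settings above change the final feature-map size for a 32x32 CIFAR input (s2 is assumed to be 2 here and the stem-conv stride is ignored, so the numbers are only illustrative):

    # Only the first block of each stage applies its stride s.
    def final_size(input_size, settings):
        size = input_size
        for t, c, n, s in settings:
            size //= s
        return size

    correct = [[1, 16, 1, 1], [6, 24, 4, 2], [6, 32, 6, 2], [6, 64, 8, 2],
               [6, 96, 6, 1], [6, 160, 6, 1], [6, 320, 1, 1]]
    wrong   = [[1, 16, 1, 1], [6, 24, 4, 2], [6, 32, 6, 2], [6, 64, 8, 2],
               [6, 96, 6, 1], [6, 160, 6, 2], [6, 320, 1, 1]]
    print(final_size(32, correct))  # 4 -> 4x4 feature map before the classifier
    print(final_size(32, wrong))    # 2 -> one extra 2x downsampling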

cvtower commented 4 years ago

Please use this version: https://github.com/cvtower/SeesawNet_pytorch/tree/master/cifar_test/baseline_version and ignore the other config files for the shufflenetv2/mnasnet/half versions (I trained all the available models hundreds of times on CIFAR before the ImageNet training).