Res2Net / Res2Net-PretrainedModels

(ImageNet pretrained models) The official PyTorch implementation of the TPAMI paper "Res2Net: A New Multi-scale Backbone Architecture"
https://mmcheng.net/res2net/

Res2NeXt on Cifar100 #36

Closed qiangwang57 closed 2 years ago

qiangwang57 commented 4 years ago

Hi @gasvn ,

Thanks for the brilliant work!

I have a couple of simple questions regarding Res2NeXt on CIFAR-100.

  1. The implementation for ImageNet used a block without hierarchical addition for downsampling, but the code you mentioned in other issue threads (https://gist.github.com/gasvn/cd7653ef93fb147be05f1ae4abad6589) instead used group convolutions as the first block of each stage for downsampling. Which one is correct?
  2. Did you use batch size 256 or 128 for the training? I saw your init LR was set to 0.05, which ResNeXt used for batch size 256.

Best wishes,

Qiang

gasvn commented 4 years ago

What's your reproduced number? The downsampling module has no hierarchical addition, and using either group conv or the ImageNet Res2Net form for the downsampling module gives similar results on CIFAR-100. I used a batch size of 64 and lr = 0.05 on CIFAR-100, without tuning.
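For reference, here is a minimal sketch of the scale loop from the official ImageNet Bottle2neck forward pass (simplified: BN, ReLU, and the handling of the last split are omitted), showing how a downsampling ("stage") block skips the hierarchical addition:

```python
import torch

def res2net_scale_loop(spx, convs, stype="normal"):
    """Simplified Res2Net scale loop.

    spx:   feature-map chunks from torch.split(out, width, dim=1)
    convs: one 3x3 conv per chunk (any list of callables)
    stype: "normal" blocks add the previous output to the next chunk
           (hierarchical addition); "stage" (downsampling) blocks do not.
    """
    outs, sp = [], None
    for i, conv in enumerate(convs):
        if i == 0 or stype == "stage":
            sp = spx[i]          # downsampling block: take the raw chunk
        else:
            sp = sp + spx[i]     # normal block: hierarchical addition
        sp = conv(sp)
        outs.append(sp)
    return torch.cat(outs, dim=1)
```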

gasvn commented 4 years ago

Please let me know if you still cannot reproduce our results.

qiangwang57 commented 4 years ago

Thanks @gasvn for the timely response.

I followed the ImageNet architecture with batch size 128 and lr 0.1 on 4 GPUs. I managed to reproduce the ResNeXt results, but Res2NeXt only reaches 80.78.

The only difference I found is the mean and std: yours are mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225], which are the ImageNet statistics. I wonder why you chose them for CIFAR-100?
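For comparison, here is a sketch of the two normalization choices; the CIFAR-100 statistics below are the commonly used approximate values, not taken from this repo:

```python
from torchvision import transforms

# ImageNet statistics, as used in the training script discussed above.
imagenet_norm = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225])

# Commonly used approximate CIFAR-100 statistics (assumption: not from this repo).
cifar100_norm = transforms.Normalize(mean=[0.5071, 0.4865, 0.4409],
                                     std=[0.2673, 0.2564, 0.2762])
```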

gasvn commented 4 years ago

I didn't notice this when I was training Res2NeXt. Maybe you can try one GPU with batch size 64, as I did. From my experience, it should not be hard to reproduce the result. I will send you my code once I find it.

qiangwang57 commented 4 years ago

> I didn't notice this when I was training Res2NeXt. Maybe you can try one GPU with batch size 64, as I did. From my experience, it should not be hard to reproduce the result. I will send you my code once I find it.

Thanks @gasvn , I will try it and get back to you with the results.

qiangwang57 commented 4 years ago

Hi @gasvn ,

I have tried different combinations of the downsampling block, batch size, lr, number of GPUs, and mean/std, but unfortunately I did not manage to reproduce the results, or even come close. The best run so far still has an error of over 18%.

gasvn commented 4 years ago

Have you managed to reproduce our results?

qiangwang57 commented 4 years ago

Unfortunately, no...



gasvn commented 4 years ago

I managed to find the code I used for training Res2Net on CIFAR-100. It can reproduce the result of Res2NeXt-29, 6c×24w×4s.

https://gist.github.com/gasvn/a1793919427f799e74bb7c900af11d4c

qiangwang57 commented 4 years ago

> I managed to find the code I used for training Res2Net on CIFAR-100. It can reproduce the result of Res2NeXt-29, 6c×24w×4s.
>
>   • BestPrec so far@1 83.020 in epoch 273
>
> https://gist.github.com/gasvn/a1793919427f799e74bb7c900af11d4c

Perfect! Thank you very much! I will let you know the results!

I assume you used the following parameters for the training:

  • batch size: 64
  • init LR: 0.05
  • single GPU
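For concreteness, a hypothetical sketch of that assumed setup (the momentum and weight-decay values below are typical CIFAR choices, not confirmed in this thread):

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Conv2d(3, 64, 3)  # placeholder standing in for Res2NeXt-29

# Assumed from this thread: single GPU, batch size 64, init LR 0.05.
# Momentum 0.9 and weight decay 5e-4 are typical CIFAR settings (assumption).
optimizer = optim.SGD(model.parameters(), lr=0.05,
                      momentum=0.9, weight_decay=5e-4)
batch_size = 64
```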

Apart from those, is there anything else I need to pay special attention to?

Cheers, Qiang

gasvn commented 4 years ago

Have you managed to reproduce our results? Sorry, there is nothing else that I can help you with.

swjtulinxi commented 4 years ago

When stride = 2, the feature maps before and after the block have different sizes; how are they fused? Wouldn't adding them directly be a problem?
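For reference, ResNet-style blocks (Res2Net included) handle this with a downsample projection on the identity branch, e.g. a strided 1×1 conv (some variants use pooling followed by a 1×1 conv), so both tensors match before the addition. A minimal sketch:

```python
import torch
import torch.nn as nn

in_channels, out_channels, stride = 64, 128, 2

# Projection shortcut: the stride matches the spatial size and the 1x1 conv
# matches the channel count, so the residual addition is well-defined.
downsample = nn.Sequential(
    nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False),
    nn.BatchNorm2d(out_channels),
)

x = torch.randn(1, in_channels, 32, 32)
body_out = torch.randn(1, out_channels, 16, 16)  # stand-in for the block body
out = body_out + downsample(x)                   # shapes agree: (1, 128, 16, 16)
```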

qiangwang57 commented 4 years ago

> Have you managed to reproduce our results? Sorry, there is nothing else that I can help you with.

Unfortunately, I did not manage to reproduce the results, or even come close.

Anyway, really appreciate your help!