ImageNet test - Githubissues

dccho commented 8 years ago

I'm trying to make DenseNet for ImageNet dataset. But, it doesn't converge well. Have you ever try DenseNet to ImageNet dataset? Please share it if you have any successful densenet network for imagenet.

liuzhuang13 commented 8 years ago

We are experimenting with imagenet, right now we successfully trained a model with only 10m params, the top 1 error is 28.7% which is better than resnet-18 （30.4%）which has 11m params. If you want the model I can share it with you later

thanks

dccho commented 8 years ago

@liuzhuang13 Thanks! Hope to see your great densenet model soon.

liuzhuang13 commented 8 years ago

Thanks, probably imagenet results needs a while. Do you want the model definition file for imagenet? Or do you want an actual pretrained model? I can share with you through email. Leave your email!

dccho commented 8 years ago

Thanks~! My email is dccho.cvpr.phd@gmail.com. If pretrained model is too big, you can send me definition only. I'll train from scratch

wlw208dzy commented 7 years ago

@liuzhuang13 I would appreciate it if you can send me the model definition file for ImageNet Dataset. My email is dzy_wlw@163.com. Thanks!

liuzhuang13 commented 7 years ago

@wlw208dzy I'll share the links with you here.

densenet (10M parameters, 28.7% val error) definition: https://1drv.ms/u/s!AjwB4qLCejx-be9Qh7ZT-RtvV38 pretrained model: https://1drv.ms/u/s!AjwB4qLCejx-a17znBzqnquzaJY

densenet (40M parameters, 24.0% val error) definition: https://1drv.ms/u/s!AjwB4qLCejx-bJQcJQi9ptGgbT0 pretrained model: https://1drv.ms/u/s!AjwB4qLCejx-bp0a4WlshgcWrNs

Due to limited resources, these are only preliminary models, we're still investigating different architecture design (e.g., bottleneck structures) for DenseNets.

argman commented 7 years ago

@liuzhuang13 , does densenet(40m parameters) compare to resnet-152 ? from slim, the val error of resnet-152 is about 24.0% And how long does it take to train on imagenet ? Why do you choose Nesterov as optimizer ? Tks!

liuzhuang13 commented 7 years ago

@argman

@liuzhuang13 , does densenet(40m parameters) compare to resnet-152 ?

From this page https://github.com/facebook/fb.resnet.torch/tree/master/pretrained (Facebook's original implementation), resnet-152 has val error 22.16%, which is better than Densenet with 40M parameters. It has 60M parameters though. Note that data augmentation, optimization, etc. are kept the same. The tensorflow implementation may have some differences.

And how long does it take to train on imagenet ?

It took us 10 days to train 40M densenet for 120 epochs on 4 TITAN X GPUs, with batchsize 128

Why do you choose Nesterov as optimizer ?

We followed fb.resnet.torch's implementation for every setting and hyperparameter, except a smaller batchsize (due to memory constraint) and slightly more training epochs.