ImageNet training - Githubissues

jlindsey15 commented 4 years ago

Hello! Thanks for making this code available. I wanted to ask what your experience and results have been running this algorithm on ImageNet, and how this code can be used to do so. When I set ImageNet as the dataset, my GPU runs out of memory. I can imagine various ways to address this, but I wanted to see whether you have run into a similar issue and, if so, how you addressed it. Thank you!

larspars commented 4 years ago

Hi! We dabbled a little with ImageNet, but not a lot. On my hardware (somewhat old GPUs, slow disks), training on ImageNet takes a very long time, so it's hard to iterate meaningfully. We were able to get 43.4% top-1 error on the validation set, which is close to AlexNet, but I suspect this may be far from the ceiling. This was with 112x112 images and the model split across two GPUs (first layers on one GPU, last layers on the other). Unfortunately, I don't think I have the code for this in a usable state. For the paper we wanted to keep the convnet architecture relatively uniform across all experiments, but I suspect a different architecture could give better results. The activation vectors really blow up in size with larger images, so I suspect there's room for improvement in handling that differently.
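
To illustrate the point about activation size, here is a rough back-of-the-envelope sketch in plain Python. It is not taken from the repository: the layer shape (128 channels), the batch size (128), the 32x32 baseline, and the float32 storage are all assumptions chosen only to show how memory for a single feature map grows with input resolution.

```python
def activation_bytes(batch, channels, height, width, bytes_per_value=4):
    """Bytes needed to hold one feature map of shape (batch, channels, height, width).

    bytes_per_value=4 assumes float32 storage (an assumption, not a repo setting).
    """
    return batch * channels * height * width * bytes_per_value


# Hypothetical early conv layer: 128 channels, batch of 128, before any downsampling.
BATCH, CHANNELS = 128, 128

small = activation_bytes(BATCH, CHANNELS, 32, 32)      # small-image baseline (assumed)
medium = activation_bytes(BATCH, CHANNELS, 112, 112)   # resolution mentioned in the comment
large = activation_bytes(BATCH, CHANNELS, 224, 224)    # a common full-size ImageNet crop

for name, nbytes in [("32x32", small), ("112x112", medium), ("224x224", large)]:
    print(f"{name}: {nbytes / 2**20:.0f} MiB per feature map")
```

Memory scales with height times width, so going from 112x112 to 224x224 quadruples the cost of every full-resolution feature map, before counting gradients or anything the loss itself stores. Under these assumed numbers, that is one plausible reason a single GPU runs out of memory.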

jlindsey15 commented 4 years ago

Got it, thanks for the info! One more question, if you don't mind -- did you try the "bio plausible" version on ImageNet as well?

larspars commented 4 years ago

No, only using the predsim loss. BTW, I've edited my previous comment: I originally said 43.4% top-1 accuracy, but actually meant 43.4% top-1 error, in other words ~56.6% top-1 accuracy.

anokland / local-loss

ImageNet training #3