Closed — mlearning closed this issue 7 years ago
I can answer the second question here: one of the focuses of this paper is to do semi-supervised learning, which deals with the situation where we have a combination of labeled and unlabeled images (usually far fewer labeled than unlabeled).
Thank you for answering my 2nd question!
Over the past weekend I tried both the STL model and Inception v3 on the STL-10 dataset. I tried visit_weight between 0 and 0.25, sup_per_batch from 10 to 100, and unsup_batch_size from 100 to 2000, and got a maximum of 65% accuracy during evaluation. I used learning rate decay from 1e-4 to 1e-6 every 10,000 steps. Could you point me in a direction on which parameters I should focus on tuning?
This should converge to 81% test accuracy after 1M iterations (~12h) with visit_weight = 0.0 and augmentation = true.
I used 20 worker_replicas and 7 ps_tasks for asynchronous training.
The initial learning rate was 1e-5 and decayed to 6e-6 and 1e-6 after 300000 and 1000000 steps, respectively.
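For concreteness, the schedule described above can be sketched as a piecewise-constant function. This is only an illustration of the stated settings, not the repo's actual code (the function name and boundary handling are assumptions; in TensorFlow this would typically be wired up with something like `tf.train.piecewise_constant`):

```python
def learning_rate(step):
    # Piecewise-constant schedule matching the settings above:
    # 1e-5 until step 300,000, then 6e-6 until step 1,000,000, then 1e-6.
    if step < 300000:
        return 1e-5
    elif step < 1000000:
        return 6e-6
    return 1e-6
```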
Hi,
Would you mind sharing your full list of hyperparameters for reproducing the STL-10 result shown in the paper? I was far from reproducing it after tweaking a few parameters here and there, starting from the values used for the other datasets.
And one more question: the paper mentions that you used only 100 images per label. Is there a particular reason you don't take advantage of all the labeled images available?