pumpikano / tf-dann

Domain-Adversarial Neural Network in Tensorflow
MIT License

having trouble with convergence #21

Open penguinshin opened 6 years ago

penguinshin commented 6 years ago

Hi, first off, thank you for the wonderful code. I am trying to replicate the toy blob example in PyTorch. I am finding that it converges to the accuracies you report only unreliably: sometimes it does not converge at all, and other times it reaches the 97% source / 97% target accuracy. Also, source-only training yields 50% accuracy on the target domain. I was wondering whether you encountered any snags that hindered convergence?

Thanks

Austin

DRJ2016 commented 6 years ago

I have the same question.

Engineero commented 6 years ago

Does lowering your learning rate help?

pumpikano commented 6 years ago

Actually, the blobs example is fairly unreliable in general - I occasionally get poor results after repeated runs. Honestly, I didn't do any tuning of hyperparams - it was just a small, fast experiment to validate the implementation while I was writing it. If you find hyperparams that work better, please share them and I can update the example.

penguinshin commented 6 years ago

For me, the biggest discrepancy with the blobs example was that source-only training resulted in trivial (50%) accuracy. Did you get this as well? As for hyperparameters, widening the feature extractor (i.e. going from 8 to 50 dimensions) was what allowed it to converge at all for me; a sketch of that change is below.
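For reference, a minimal sketch of that widening, assuming a single-hidden-layer feature extractor on the 2-D blob inputs (the Keras layers and the `feature_extractor` name here are illustrative, not the repo's actual code):

```python
import tensorflow as tf

# Hypothetical feature extractor for the 2-D blobs example, widened
# from 8 to 50 hidden units as described above.
inputs = tf.keras.Input(shape=(2,))
hidden = tf.keras.layers.Dense(50, activation="relu")(inputs)  # was Dense(8)
feature_extractor = tf.keras.Model(inputs, hidden)
```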

pumpikano commented 6 years ago

I take it that the 50% was the source accuracy, not the target accuracy? In that case, there is certainly something wrong, but 50% accuracy on the target domain is not unusual if you only train on the source.

One thing that might help is annealing the gradient reversal parameter. I do this in the MNIST example, following the schedule presented in the paper, but for the blobs example I keep it fixed at -1 throughout training. That is almost certainly not the optimal thing to do.
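For concreteness, a minimal sketch of that schedule from the paper, lambda_p = 2 / (1 + exp(-gamma * p)) - 1 with gamma = 10, where p is the fraction of training completed (the function name is mine, not the repo's):

```python
import numpy as np

# Gradient reversal weight schedule from the DANN paper:
# lambda_p = 2 / (1 + exp(-gamma * p)) - 1, with gamma = 10.
# lambda anneals from 0 at the start of training toward ~1 at the end,
# so the domain loss is phased in gradually rather than applied at
# full strength from the first step.
def grl_lambda(step, total_steps, gamma=10.0):
    p = step / total_steps
    return 2.0 / (1.0 + np.exp(-gamma * p)) - 1.0
```

Scaling the reversed gradient by this per-step value, instead of keeping the multiplier fixed at -1 throughout, is what the MNIST example does.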