cle-ros / RoutingNetworks

Apache License 2.0

Reproducing results in the paper #2

Open tianheyu927 opened 5 years ago

tianheyu927 commented 5 years ago

Hi,

I ran the script provided in PytorchRouting/Examples/run_experiments.py and was unable to reproduce the CIFAR-MTL results reported in the paper: I get ~53% accuracy, while the paper reports 70%. I noticed a comment in run_experiments.py that says:

WPL_routed_all_fc(3, 512, 5, dataset.num_tasks, dataset.num_tasks)
Training averages: Model loss: 0.427, Routing loss: 8.864, Accuracy: 0.711
Testing averages: Model loss: 0.459, Routing loss: 9.446, Accuracy: 0.674

Did you use a different set of hyperparameters for the paper, and if so, would you be willing to share them? Thanks!

cle-ros commented 5 years ago

Hi. I am aware of this problem. It is the consequence of a major rewrite I did to speed up the code (from loop-based routes within a batch to mask-based routes), which slightly changed the exact impact of the hyperparameters. I have verified the correctness of the new code for our new paper (https://nlp.stanford.edu/pubs/cases2019recursiverouting.pdf), so I can guarantee that the code works. I will try to run a new hyperparameter sweep as soon as I can.
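For readers unfamiliar with the distinction, here is a rough sketch of what loop-based versus mask-based routing within a batch could look like. It is not taken from the repository: the toy modules (`double`, `negate`, `add_ten`), the `MODULES` list, and both routing functions are hypothetical, and plain Python lists stand in for tensors. It only shows that the two styles can produce the same forward outputs; it does not attempt to explain why the hyperparameters behave differently after the rewrite.

```python
# Illustrative only: toy "modules" and routing decisions, not the repository's code.

def double(x):
    return 2.0 * x

def negate(x):
    return -x

def add_ten(x):
    return x + 10.0

MODULES = [double, negate, add_ten]  # hypothetical stand-ins for routed layers

def route_with_loop(batch, decisions):
    """Loop-based routing: apply each sample's selected module one at a time."""
    return [MODULES[d](x) for x, d in zip(batch, decisions)]

def route_with_masks(batch, decisions):
    """Mask-based routing: run every module on the full batch, keep selected rows."""
    out = [0.0] * len(batch)
    for idx, module in enumerate(MODULES):
        # 1.0 where this module was selected for the sample, 0.0 elsewhere
        mask = [1.0 if d == idx else 0.0 for d in decisions]
        module_out = [module(x) for x in batch]  # one whole-batch call per module
        out = [o + m * y for o, m, y in zip(out, mask, module_out)]
    return out

batch = [1.0, 2.0, 3.0, 4.0]
decisions = [0, 2, 1, 0]  # module index chosen for each sample
loop_result = route_with_loop(batch, decisions)
mask_result = route_with_masks(batch, decisions)
```

In a tensor library, the mask-based variant replaces many small per-sample calls with a few whole-batch calls, which is typically where a speedup of this kind comes from.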