castorini / castor

PyTorch deep learning models for text processing
http://castor.ai/
Apache License 2.0

Hyper-parameter tuning for VDPWI #121

Open lintool opened 6 years ago

lintool commented 6 years ago

According to @daemon, the VDPWI implementation works: https://github.com/castorini/Castor/tree/master/vdpwi

But its effectiveness is still below SOTA because the hyper-parameters haven't been tuned yet.

daemon commented 6 years ago

The old implementation was about 0.5 points off on Pearson's r on the test set; now it's closer to 2. The biggest changes from the old implementation are the switch to torchtext and PyTorch 0.4. The model code itself hasn't changed.
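For context, the torchtext pipeline mentioned here looks roughly like the following under the 0.x-era legacy API (a minimal sketch; the field names, file names, and data path are illustrative assumptions, not the repo's actual setup):

```python
import torch
from torchtext import data

# Legacy (0.x-era) torchtext pipeline sketch; fields and files are illustrative.
TEXT = data.Field(sequential=True, lower=True, batch_first=True)
SCORE = data.Field(sequential=False, use_vocab=False, dtype=torch.float)

train, dev, test = data.TabularDataset.splits(
    path='data/sick', format='tsv',
    train='train.tsv', validation='dev.tsv', test='test.tsv',
    fields=[('sentence_a', TEXT), ('sentence_b', TEXT), ('score', SCORE)])

TEXT.build_vocab(train, dev, test)
train_iter, dev_iter, test_iter = data.BucketIterator.splits(
    (train, dev, test), batch_size=8,
    sort_key=lambda ex: len(ex.sentence_a))
```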

likicode commented 6 years ago

I ran 216 runs over the following parameter grid: decay = [0.99, 0.95], lr = [5e-4, 1e-4], batch_size = [8, 16], momentum = [0, 0.15, 0.05], rnn_hidden_dim = [128, 256, 512], epochs = [10, 15, 20]. In all runs, I used RMSProp for optimization, following the paper.
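The sweep can be scripted roughly like this (a sketch; the flag names come from the commands quoted in this thread, but the `python -m vdpwi` entry point is an assumption about the repo layout):

```python
# Sketch of the 216-run grid search described above.
import itertools
import subprocess

grid = {
    '--decay':          [0.99, 0.95],
    '--lr':             [5e-4, 1e-4],
    '--batch-size':     [8, 16],
    '--momentum':       [0, 0.15, 0.05],
    '--rnn-hidden-dim': [128, 256, 512],
    '--epochs':         [10, 15, 20],
}

for combo in itertools.product(*grid.values()):
    # Interleave each flag with its value: 2*2*2*3*3*3 = 216 combinations
    args = [s for flag, val in zip(grid, combo) for s in (flag, str(val))]
    # Entry point is assumed; RMSProp is fixed for all runs, as in the paper
    subprocess.run(['python', '-m', 'vdpwi', '--optimizer', 'rmsprop', *args],
                   check=True)
```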

The best Pearson's r on the test set is 0.8707, which is 0.0077 lower than the original result. It is achieved with this setting: `--decay 0.95 --lr 0.0005 --optimizer rmsprop --momentum 0 --epochs 15 --batch-size 8 --rnn-hidden-dim 256`.

All of the "nearly best" results (e.g., 0.8678, 0.8667) share the same core parameters: `--lr 5e-4 --batch-size 8 --epochs 15`.

I also ran some tests with SGD and Adam; their performance is 1-2 points lower than RMSProp's.
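For reference, the three optimizers compared here map onto stock PyTorch as follows (a minimal sketch; the placeholder model and the SGD/Adam settings are illustrative, only the RMSProp settings come from the runs above):

```python
import torch

# Placeholder network standing in for VDPWI; only the optimizer setup matters here.
model = torch.nn.Linear(10, 1)

# Best setup in these experiments: RMSProp with lr 5e-4 and zero momentum
optimizer = torch.optim.RMSprop(model.parameters(), lr=5e-4, momentum=0)

# The alternatives tried, both ~1-2 points worse here (settings illustrative):
# optimizer = torch.optim.SGD(model.parameters(), lr=5e-4, momentum=0.9)
# optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
```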

Victor0118 commented 6 years ago

Good results! So it is very close to the original paper, right? That means VDPWI-pytorch works! By the way, in my experience, SGD with a well-chosen learning rate is usually the best setup. Could you send a PR to update the README after you finish the tuning? @likicode

likicode commented 6 years ago

I've updated the README and sent a PR. @Victor0118

likicode commented 6 years ago

I re-ran the best parameter setting (Pearson's r 0.8707) 80 times with different random seeds. The 95% confidence interval is [0.8625, 0.8644]. Among these 80 runs, the highest Pearson's r is 0.8710, obtained with random seed 723.

The parameter setting is: `--classifier vdpwi --lr 0.0005 --optimizer rmsprop --epochs 15 --momentum 0 --batch-size 8 --rnn-hidden-dim 256`

|                | Pearson's r | Spearman's ρ | MSE    |
|----------------|-------------|--------------|--------|
| Original paper | 0.8784      | 0.8199       | 0.2329 |
| Our result     | 0.8710      | 0.8092       | 0.2501 |

I also ran the other promising parameter settings 10 times each with different random seeds, to avoid missing potentially good settings. Two other parameter sets achieve an r value higher than 0.87: (a) 0.8705, with 95% confidence interval [0.8621, 0.8667]; (b) 0.8702, with 95% confidence interval [0.8588, 0.8682].
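For anyone reproducing these intervals: they can be computed from the per-seed scores with a t-based approximation along these lines (a sketch; the `scores` array is synthetic placeholder data, not the actual measurements):

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the 80 per-seed Pearson's r values (placeholder only)
rng = np.random.default_rng(723)
scores = rng.normal(loc=0.8635, scale=0.004, size=80)

mean = scores.mean()
sem = stats.sem(scores)  # standard error of the mean
low, high = stats.t.interval(0.95, len(scores) - 1, loc=mean, scale=sem)
print(f'95% CI: [{low:.4f}, {high:.4f}]')
```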

@lintool To sum up, our best result improves by about 2 points after parameter tuning, and it is also very close to the result of the original Torch implementation.