lixin4ever / TNet

Transformation Networks for Target-Oriented Sentiment Classification (ACL 2018)
https://arxiv.org/abs/1805.01086

How are hyper-parameters tuned? #8

Closed: bbcfox closed this issue 5 years ago

bbcfox commented 5 years ago

Hi, Li Xin! Your paper says that all hyper-parameters are tuned on 20% randomly held-out training data.

However, I don't see any procedure in your code that holds out part of the training examples. How are the hyper-parameters tuned? Only based on the test examples?

lixin4ever commented 5 years ago
  1. The code for hyper-parameter tuning is independent of the train-test experiment, so we did not add it to the codebase.
  2. The process of selecting hyper-parameters is as follows (a code sketch is given after this list):
     (1) We first prepare a set of candidate values for each hyper-parameter (e.g., the candidate values for $dim_h$ are [50, 100, 150]).
     (2) We train the model on 80% of the training data and evaluate its performance on the remaining 20%.
     (3) The collection of hyper-parameter values that obtains the best performance is used in the train-test experiment.
     (4) We slightly adjust batch_size and $n_k$ according to the results on the testing datasets.
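
A minimal sketch of steps (1)-(3) might look like the following. This is not the actual tuning script from the paper; `train_eval`, the dropout and learning-rate grids, and the split seed are illustrative assumptions.

```python
import itertools
import random

def split_train_dev(examples, dev_ratio=0.2, seed=42):
    """Randomly hold out `dev_ratio` of the training data for tuning."""
    examples = list(examples)
    random.Random(seed).shuffle(examples)
    n_dev = int(len(examples) * dev_ratio)
    return examples[n_dev:], examples[:n_dev]  # 80% for training, 20% held out

def grid_search(train_examples, train_eval):
    """Pick the hyper-parameter combination with the best held-out accuracy.

    `train_eval(train, dev, **params)` is a hypothetical callable that trains
    a model with `params` and returns its accuracy on `dev`.
    """
    grid = {
        "dim_h": [50, 100, 150],     # candidate values mentioned above
        "dropout": [0.1, 0.3, 0.5],  # hypothetical candidates
        "lr": [0.001, 0.01],         # hypothetical candidates
    }
    train, dev = split_train_dev(train_examples)
    best_params, best_acc = None, -1.0
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        acc = train_eval(train, dev, **params)
        if acc > best_acc:
            best_params, best_acc = params, acc
    return best_params  # then used in the final train-test experiment
```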
bbcfox commented 5 years ago

Thanks for your response. In the main file, the number of training iterations is set to 100, and the test results of every epoch are saved. How do I get the final accuracy and F1 score: the mean value over epochs or the highest one? The number of training iterations has a significant influence on the results for small datasets.

lixin4ever commented 5 years ago

In principle, the final results should be those obtained at epoch 100. However, training is quite unstable due to the small size of the training dataset, so the number of training iterations (epochs) should be chosen carefully.

For TNet-LF, you can pick the results obtained at epoch 25 or 30 (i.e., set the number of training iterations to 25 or 30).

For TNet-AS, you can pick the results obtained at epoch 50, 60, or 70, since we observe that TNet-AS is more difficult to train.
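
In other words, fix the number of epochs in advance and report the scores from that final epoch, rather than the mean or the maximum over all saved epochs. A minimal sketch of reading back saved per-epoch results (the list-of-tuples format is an assumption, not the repo's actual output format):

```python
# Hedged sketch: pick the scores at a pre-chosen stopping epoch from a
# per-epoch result log. Adapt the data layout to however your run
# actually stores (accuracy, macro-F1) for each epoch.
def scores_at_epoch(per_epoch_results, stop_epoch):
    """per_epoch_results: [(acc, macro_f1), ...], entry 0 is epoch 1."""
    acc, macro_f1 = per_epoch_results[stop_epoch - 1]
    return acc, macro_f1

# Example: report TNet-LF at epoch 25, as suggested above.
# acc, macro_f1 = scores_at_epoch(results, stop_epoch=25)
```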

bbcfox commented 5 years ago

Thanks!