xmshi-trio / MSL


How to reproduce DRNN+A+WS? #6

Closed: FlyingCat-fa closed this issue 1 year ago

FlyingCat-fa commented 2 years ago

I want to reproduce the DRNN+A+WS result in the paper, without bert_embedding. Could you tell me how to do that? Thanks.

xmshi-trio commented 2 years ago

BERT achieves better performance, so BERT embeddings are used in this version of the code. If you want to use GloVe instead, you would need to change a bit more of the code. I recommend using BERT.
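
For anyone who does attempt the GloVe route, here is a minimal sketch of the kind of change involved, assuming a PyTorch code base; the function name, file format handling, and vocabulary layout are hypothetical and not taken from the MSL code:

```python
import numpy as np
import torch
import torch.nn as nn

def build_glove_embedding(glove_path, vocab, dim=300):
    """Build an embedding layer from a GloVe text file.

    `glove_path`, `vocab` (a word -> index dict), and `dim` are
    placeholders; the real MSL code organizes its vocabulary differently.
    """
    # Start from small random vectors so out-of-vocabulary words
    # still get a usable embedding
    matrix = np.random.normal(scale=0.1, size=(len(vocab), dim)).astype("float32")
    with open(glove_path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            if word in vocab and len(values) == dim:
                matrix[vocab[word]] = np.asarray(values, dtype="float32")
    # freeze=False lets the GloVe vectors be fine-tuned with the rest of the model
    return nn.Embedding.from_pretrained(torch.from_numpy(matrix), freeze=False)
```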

FlyingCat-fa commented 2 years ago

Thanks. I have reproduced the DRNN results and am now reproducing the BERT ones. I see that grid search is used to select the BERT hyperparameters. Could you tell me the best hyperparameters, including batch size, number of epochs, learning rate, and any other relevant settings?

xmshi-trio commented 2 years ago

The learning rate is 5e-2, and the eps is 1e-2.
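
For reference, if the optimizer is PyTorch's AdamW (an assumption; only the two values come from the reply above), those settings would be passed like this:

```python
import torch
import torch.nn as nn

model = nn.Linear(768, 2)  # stand-in for the actual BERT classifier head
# lr and eps as given above; whether MSL uses AdamW is an assumption
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-2, eps=1e-2)
```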

FlyingCat-fa commented 2 years ago

Thanks. I reproduced BERT-TST with those hyperparameters and selected the best batch size (32), but I cannot reproduce the results. Mine are: Precision 90.99, Recall 85.92, Micro F1 88.38, Macro F1 85.39, Accuracy 71.8. These are quite different from those in your paper, especially Recall and Macro F1.

A similar difference is observed when reproducing BERT-Raw.

xmshi-trio commented 2 years ago

Sorry for the late response. I have been busy with the EMNLP paper submission.

The BERT learning rate is set to 5e-2, the random seed to 2020, the batch size to 32, pretrain_num_epochs to 10, num_epochs to 40, and self_num_epochs to 100.
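
Gathered into one place as a sketch only (the key names mirror the comment above, but the dict format and seed helper are assumptions, not how MSL actually reads its configuration):

```python
import random
import numpy as np
import torch

# Values exactly as reported above; the dict format itself is assumed,
# not taken from the MSL repository.
config = {
    "learning_rate": 5e-2,
    "random_seed": 2020,
    "batch_size": 32,
    "pretrain_num_epochs": 10,
    "num_epochs": 40,
    "self_num_epochs": 100,
}

def set_seed(seed):
    # Fix every RNG involved so a run is repeatable
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

set_seed(config["random_seed"])
```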

The CUDA version is 11.2. Our experiments are conducted on a workstation with an Intel Xeon E5 2.40 GHz CPU, 128 GB of memory, an NVIDIA RTX 2080 Ti GPU, and CentOS 7.2.

If you have any questions, please do not hesitate to contact us.

FlyingCat-fa commented 2 years ago

Thank you. Your reply is very detailed. Good luck with your submission.