Closed: ztl-35 closed this issue 3 years ago
The default hyper-parameters in run_typing.py should reproduce the result in the paper on the same computing infrastructure (32G V100). If a V100 is not available, try the grid search below:
batch_size: [16, 32]
lr: [2e-5, 3e-5, 5e-5]
beta: [0.999, 0.98]
weight_decay: [0.01, 0.1]
epoch: [3, 4, 5]
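For reference, here is a minimal sketch of how the grid above could be enumerated in Python. It only builds the list of candidate configurations. How each one is passed to run_typing.py is not shown, since that depends on the script's own arguments, and the helper name `iter_configs` is hypothetical. The thread also does not say which optimizer coefficient `beta` refers to, so the key is kept exactly as written.

```python
from itertools import product

# Search space copied from the grid suggested in this thread.
GRID = {
    "batch_size": [16, 32],
    "lr": [2e-5, 3e-5, 5e-5],
    "beta": [0.999, 0.98],
    "weight_decay": [0.01, 0.1],
    "epoch": [3, 4, 5],
}


def iter_configs(grid):
    """Yield one dict per combination of values in the grid (hypothetical helper)."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(GRID))
print(len(configs))  # 2 * 3 * 2 * 2 * 3 = 72 combinations
print(configs[0])
```

The full grid is 72 fine-tuning runs, so it may be worth fixing the less sensitive settings first and comparing candidates on the dev set only.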
Hi, thanks for your source code. I tried to reproduce your Open Entity result using your pre-trained model. However, the best score I can achieve on the Open Entity dataset is 75.4 (F1). The fine-tuning code, pre-trained model, and running environment are the same as described in this GitHub README.