Closed ViktorooReps closed 2 years ago
Hi,
You may try using a smaller learning rate, or you can accumulate gradients for 2 steps.
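For reference, here is a minimal sketch of what accumulating gradients for 2 steps means. It is plain Python on a toy 1-D linear model, not the repository's training code, and the names `mse_grad`, `accumulated_grad`, and `accumulation_steps` are made up for illustration. The idea is that averaging the gradients of 2 equal-sized micro-batches gives the same gradient as one batch twice as large, so a halved batch size can be compensated for without extra memory.

```python
# Hypothetical toy example: model y = w * x with mean squared error loss.

def mse_grad(w, batch):
    """Gradient of mean((w * x - y) ** 2) with respect to w over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, batch, accumulation_steps):
    """Split `batch` into equal micro-batches and accumulate their gradients."""
    assert len(batch) % accumulation_steps == 0, "micro-batches must be equal-sized"
    micro_size = len(batch) // accumulation_steps
    total = 0.0
    for step in range(accumulation_steps):
        micro = batch[step * micro_size:(step + 1) * micro_size]
        # Scale each micro-batch gradient so the sum matches the full-batch mean.
        total += mse_grad(w, micro) / accumulation_steps
    return total


data = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.5)]
w = 0.5

full = mse_grad(w, data)                                 # one batch of 4
accum = accumulated_grad(w, data, accumulation_steps=2)  # two micro-batches of 2
print(full, accum)
```

In a real training loop the same effect usually comes from dividing each micro-batch loss by the number of accumulation steps and calling the optimizer step only once every 2 backward passes. Whether the training script here exposes an option for this is something to check in its argument list.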
I have been able to reach a 0.81 F1 score at around step 1800, though only on the dev dataset. On the test dataset it peaked at around 0.78.
To my understanding, connl_self_training.sh does not set the optimal hyperparameters used to achieve the results reported in the paper. Could you share the hyperparameters you used to obtain 0.81 on the test dataset?
Can you share the details on distant label generation? viktoroo.sch@gmail.com
Hi Viktor, I've sent you an email regarding the distant label generation code and the reproducibility issue.
Hello, I am having trouble reproducing the reported performance as well. Using the .sh scripts directly doesn't do the trick. Could you please provide me with the correct ones? Thank you in advance!
I'm having trouble reproducing your results on the CoNLL dataset.
All I have changed is the batch sizes:
The eval loss consistently goes up during the second stage of self-training. Can you help me figure out what I am doing wrong?