Closed: ViktorooReps closed this issue 2 years ago.
I ran with the default hyperparameters and `lr_decay=0.2`. Got the following results:
The best dev F1: 77.9107505070994
The corresponding test: [96.91211401425178, 57.79036827195468, 72.40461401952085]
Thanks for your interest. I'm trying to reproduce your error on my end as well. Did you use pretrained word embeddings?
BTW, I think the parameters in the current repo are the ones that I used.
If you look at this graph, I think the F1 is somewhat similar to what I have? Though the figure shows the soft approach.
Thank you for the reply!
To my understanding, the results are not really similar to yours. Here is the exact call to the training script I used: `python3 main.py --dataset conll2003 --variant soft --device=cuda:0 --num_epochs=20 --lr_decay=0.2`. I expected to see results somewhat close to *Our Soft* at the 0.5 $\rho$ mark, but the results I got are more similar to the *Simple* approach.
I did not try to run the original DyNet version of your repo. I will come back to you as soon as I get the results.
I did download the pretrained GloVe embeddings and put them under the data/ folder. To my understanding, the 100-dim vectors were used by default.
Sorry about that, but the PyTorch version only supports the hard variant. The soft variant doesn't work there; I haven't made the soft approach work in the PyTorch version yet.
The hard approach should be working, though.
Oh my bad, I got the impression (from the code and the issues) that you had already finished the soft version but hadn't updated the project description.
I will try the DyNet version then. Thank you for your time!
For anyone interested: I have managed to run the DyNet version. You can find a Dockerfile with the environment setup here: https://github.com/ViktorooReps/partial_annotation. Unfortunately, I was not able to accelerate training with a GPU (DyNet uses the device, yet training time goes up), and CPU training only utilizes one core...
I want to use your model as a baseline in my future paper, but unfortunately I cannot utilize the results reported in your paper, as I explore a smaller range of the labelled-entity fraction: 5-15%.
So I will need to run experiments with your code on this range. Thanks a lot for the PyTorch implementation, by the way!
You mentioned in the other issue that fine hyperparameter tuning is not required to achieve comparable results, but I would still appreciate you sharing the optimal hyperparameters (at least ones somewhat close to those you used in your paper).