tungngthanh closed this issue 3 years ago
You can execute the following command:
python -u run.py train \
-p \
-d 0 \
--feat bert -f exp/ptb.bert.crf \
--mbr \
--fembed data/glove.6B.100d.txt \
--unk unk
Thanks for your reply. I just checked the file "config.ini"; the default values are
bert_model = 'bert-base-cased'
n_bert_layers = 4
So with this setting, what result should I expect? The experiments in the paper use "bert-large-cased", right?
Yeah, I use bert-large-cased to be consistent with Kitaev et al. 2019.
Thank you for the fast reply. Do you think we should allow gradients through the BERT embeddings? I am quite sure that Kitaev et al. 2019 backpropagate into BERT. Your models only learn a scalar mix over the last 4 layers. If you allow gradients (which is essentially fine-tuning), the results could be better.
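For context, the "scalar mix" mentioned above is a softmax-weighted sum of the hidden states of the last few BERT layers, where only the mixing weights are trained. Here is a minimal pure-Python sketch of what such a mix computes; this is an illustration of the general technique, not this repo's actual implementation, and all names here are hypothetical.

```python
import math

def scalar_mix(layers, weights, gamma=1.0):
    """Compute gamma * sum_i softmax(weights)_i * layers[i].

    `layers` is a list of equal-length hidden-state vectors, one per
    BERT layer; `weights` holds one learnable scalar per layer.
    """
    # Softmax over the per-layer scalars so the mixture weights sum to 1.
    exps = [math.exp(w) for w in weights]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Weighted sum across layers, dimension by dimension.
    dim = len(layers[0])
    return [gamma * sum(p * layer[d] for p, layer in zip(probs, layers))
            for d in range(dim)]

# Toy example: mix four 3-dimensional "layer" vectors with uniform weights.
layers = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0],
          [1.0, 1.0, 1.0]]
out = scalar_mix(layers, weights=[0.0, 0.0, 0.0, 0.0])  # each prob = 0.25
```

With frozen BERT, only `weights` (and `gamma`) receive gradients; fine-tuning would additionally update the layer representations themselves.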
Actually, using BERT with frozen parameters is enough. I have conducted some experiments on BERT fine-tuning and did not obtain considerable gains.
Sorry for disturbing you again. I ran exactly the command you gave me but cannot get the expected result.
The max dev score is 94.28% at epoch 193; the test score at epoch 193 is 94.09%.
The only thing that may differ from your setup is that I used the dataset from Kitaev et al. 2019 directly (train/dev/test: 02-21.10way.clean, 22.auto.clean, 23.auto.clean). Do you think that could be the problem?
Sorry for the late reply, and thanks for reporting this issue. I'm working to troubleshoot the problem but still haven't figured it out... Another implementation of the CRF constituency parser is integrated into this repo, and that code behaves normally when following the training instructions. You can check it out.
After checking the parser repo, I see that the main difference between the BERT embedding in that repo and this one is the dropout function. I ran the experiments with that repo and it works very well. Sorry for troubling you one more time. Can you share the CTB datasets with me, and how to reproduce the experimental results?
Yeah, thanks. I have fixed the bug, but I forgot to mention it :(. I can't share the data with you, since that may raise copyright issues. However, you can extract the CTB files following this repo.
Yeah, I followed the instructions there. I want to check the stats for CTB 5.1 with you: the syntactic distance parser uses the split from Liu and Zhang, 2017, which has 17544/352/348 sentences in the train/dev/test sets. These are a bit different from yours: your training set has 18104 sentences. Can you guide me to reproduce your dataset? They produce the data from CTB 8.0, split by sentence ids as follows:
training = list(range(1, 270 + 1)) + list(range(440, 1151 + 1))
development = list(range(301, 325 + 1))
test = list(range(271, 300 + 1))
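To make the split concrete, here is a small sketch that assigns a CTB article file to a split based on those id ranges. The filename pattern `chtb_NNNN` is an assumption about the dump's naming, and `split_of` is a hypothetical helper, not code from either repo.

```python
import re

# Id ranges quoted above (inclusive on both ends).
training = set(range(1, 270 + 1)) | set(range(440, 1151 + 1))
development = set(range(301, 325 + 1))
test = set(range(271, 300 + 1))

def split_of(filename):
    """Map a CTB article filename (assumed to contain 'chtb_NNNN')
    to its split, or None if the id falls in no range."""
    m = re.search(r"chtb_(\d+)", filename)
    if m is None:
        return None
    fid = int(m.group(1))
    if fid in training:
        return "train"
    if fid in development:
        return "dev"
    if fid in test:
        return "test"
    # Ids outside all ranges (e.g. 326-439) belong to no split,
    # which is one way articles can go missing from a dump.
    return None
```

Note that ids 326-439 are covered by none of the three ranges, which is consistent with the discontinuous article ids mentioned below.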
Do you happen to know the id split used to generate your data? I am so sorry for troubling you time and again.
I remember that some articles were missing from that dump; the article ids are discontinuous.
Hi, your work is exciting. I can reproduce the non-pretrained parsing results quite quickly and efficiently. However, I cannot reproduce the constituency parsing results with BERT. For BERT constituency parsing, I run
python run.py train --device 0 --feat bert --file exp/ptb.bert
and get the result above. For the pretrained setting, we should get something around 95.59. Can you guide me to reproduce the results?