Open lxy444 opened 5 years ago
I implemented a new version using BERT+CRF. If you think my version is good, please give it a star. @lxy444
OK, thanks.
1. Download the pretrained Chinese BERT model and convert it to the PyTorch version.
2. Preprocess the MSRA dataset:
   - Split sentences to avoid excessive truncation. Truncation hurts the scores, especially in the test phase.
   - Convert chunk-level labels to BERT-token-level labels, for example:
     - 希望工程/o -> 希/O 望/O 工/O 程/O
     - 北京市/ns -> 北/B-NS 京/I-NS 市/I-NS
   - You can refer to 'preprocess_msra.py'.
3. Just follow 'task_config.yaml'.

From: 李向阳 <notifications@github.com>
Sent: March 23, 2019, 17:17
To: ericput/bert-ner <bert-ner@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Subject: [ericput/bert-ner] How to reproduce the performance result (#1)

Could you please add a guide on how to reproduce the performance results?
For example, how can the results on the MSRA dataset be reproduced?
Thanks.
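The two preprocessing operations in step 2 can be sketched roughly as below. This is not the repository's `preprocess_msra.py`; the function names (`convert_line`, `split_long_sentence`), the whitespace-separated `text/tag` input format, and the `max_len=126` default are assumptions made for illustration. It also assumes that the Chinese BERT tokenizer yields one token per character, which holds for Chinese characters but not necessarily for digits or Latin words.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (character, BIO label)


def convert_line(line: str) -> List[Pair]:
    """Turn chunk-level `text/tag` items into per-character BIO labels.

    Assumed input format (hypothetical): chunks separated by whitespace,
    each written as `text/tag`, with tag `o` marking non-entity text.
    """
    pairs: List[Pair] = []
    for item in line.split():
        chunk, sep, tag = item.rpartition("/")
        if not sep:  # no tag present: treat the whole item as outside text
            chunk, tag = item, "o"
        tag = tag.upper()
        for i, ch in enumerate(chunk):
            if tag == "O":
                pairs.append((ch, "O"))
            else:
                pairs.append((ch, ("B-" if i == 0 else "I-") + tag))
    return pairs


def split_long_sentence(pairs: List[Pair], max_len: int = 126,
                        delimiters: str = "。！？；，") -> List[List[Pair]]:
    """Split a labelled sentence into pieces of at most `max_len` characters.

    Cuts are placed after the last delimiter (labelled O) inside each window;
    if the window has no delimiter, the cut is a hard one at `max_len`,
    which may split an entity.
    """
    pieces, start = [], 0
    while len(pairs) - start > max_len:
        window_end = start + max_len
        cut = window_end  # fallback: hard cut
        for i in range(window_end - 1, start, -1):
            if pairs[i][0] in delimiters and pairs[i][1] == "O":
                cut = i + 1
                break
        pieces.append(pairs[start:cut])
        start = cut
    if start < len(pairs):
        pieces.append(pairs[start:])
    return pieces
```

Calling `convert_line("北京市/ns")` gives the per-character labels from the example in step 2. The value 126 only reflects a maximum sequence length of 128 minus the `[CLS]` and `[SEP]` tokens; the actual limit depends on the settings in 'task_config.yaml'.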
Thanks. I followed your instructions, and after training finished I got a prediction file named "test.predict". I guess it contains the predicted labels for the test data.
However, I don't see any evaluation metrics. Even during training, no accuracy is printed. How can I get the performance results?
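NER results are commonly reported as entity-level precision, recall, and F1, so one way to score predicted labels against gold labels is sketched below. The layout of "test.predict" is not described in this thread, so the sketch takes plain lists of BIO tag sequences; loading the file is left out, and `extract_entities` and `entity_f1` are hypothetical helper names, not part of the repository. A stray `I-` tag with no preceding `B-` is treated as outside here, which is stricter than some evaluation scripts.

```python
from typing import List, Set, Tuple


def extract_entities(tags: List[str]) -> Set[Tuple[int, int, str]]:
    """Return the (start, end, type) spans found in one BIO tag sequence."""
    entities = set()
    start, etype = None, None
    # A trailing "O" sentinel closes an entity that runs to the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        inside = tag.startswith("I-") and etype == tag[2:]
        if not inside:
            if etype is not None:
                entities.add((start, i, etype))
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
            else:
                start, etype = None, None
    return entities


def entity_f1(gold_seqs: List[List[str]],
              pred_seqs: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged entity-level precision, recall, and F1."""
    gold_total = pred_total = correct = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_entities(gold), extract_entities(pred)
        gold_total += len(g)
        pred_total += len(p)
        correct += len(g & p)  # exact match on boundaries and type
    precision = correct / pred_total if pred_total else 0.0
    recall = correct / gold_total if gold_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

An entity counts as correct only when both its boundaries and its type match the gold annotation, so a partially recovered span contributes to neither precision nor recall.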
Could you please add a guide on how to reproduce the performance results?
For example, how can the results on the MSRA dataset be reproduced?
Thanks.