Open stevezheng23 opened 5 years ago
Great. It looks like the results on NER are a bit behind the current SoTA, which is over 93. It would be great to see whether the hparams or implementation could be improved.
Yes, the result is from the initial run; I haven't tuned the hparams yet.
On Thu, Jun 27, 2019 at 1:10 AM Zhilin Yang notifications@github.com wrote:
Great. It looks like the results on NER are a bit behind the current SoTA, which is over 93. It would be great to see whether the hparams or implementation could be improved.
@stevezheng23 I have one question regarding the NER implementation: have you also experimented with using different layers? E.g. the BERT paper (Table 7) uses a feature-based approach with a concatenation of the last four layers. Could you give some details on which layers you're using in your repo? Thanks :)
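For readers unfamiliar with the feature-based approach mentioned here, a minimal, framework-free sketch of "concatenating the last four layers" might look like the following. It assumes a 12-layer encoder whose per-layer hidden states are already available as nested lists; the function name `concat_last_layers` and the toy data are hypothetical and are not taken from the BERT or XLNet code.

```python
def concat_last_layers(all_layers, num_last=4):
    """Concatenate the hidden states of the last `num_last` layers per token.

    all_layers: list of layers, each shaped [seq_len][hidden_size]
    returns:    list shaped [seq_len][num_last * hidden_size]
    """
    selected = all_layers[-num_last:]  # e.g. layers 9..12 of a 12-layer model
    seq_len = len(selected[0])
    return [
        [value for layer in selected for value in layer[t]]
        for t in range(seq_len)
    ]


# Toy data (assumption): every value in layer n equals n, so the
# origin of each feature is easy to see in the output.
num_layers, seq_len, hidden_size = 12, 3, 2
all_layers = [
    [[float(n)] * hidden_size for _ in range(seq_len)]
    for n in range(1, num_layers + 1)
]

features = concat_last_layers(all_layers)
print(features[0])  # [9.0, 9.0, 10.0, 10.0, 11.0, 11.0, 12.0, 12.0]
```

In the feature-based setting these concatenated vectors are kept fixed and fed to a separate tagger, rather than fine-tuning the encoder itself.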
@stefan-it In the initial experiments, I just fine-tuned the XLNet model by adding a dense + softmax layer on top of the last layer. I haven't run any experiments with the feature-based approach yet.
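As a rough illustration of what "a dense + softmax layer on top of the last layer" means for token-level tagging, here is a small stdlib-only sketch. It is not the code from the repo: the helper `token_classifier`, the random weights, and the sizes (5 tokens, hidden size 8, 3 labels) are all assumptions made for the example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_classifier(hidden_states, weights, bias):
    """Apply one dense layer plus softmax to every token.

    hidden_states: [seq_len][hidden_size] last-layer outputs
    weights:       [hidden_size][num_labels]
    bias:          [num_labels]
    returns:       [seq_len][num_labels] label probabilities
    """
    probs = []
    for h in hidden_states:
        logits = [
            sum(h[i] * weights[i][k] for i in range(len(h))) + bias[k]
            for k in range(len(bias))
        ]
        probs.append(softmax(logits))
    return probs


# Hypothetical sizes and random values standing in for real model outputs.
random.seed(0)
seq_len, hidden_size, num_labels = 5, 8, 3
last_layer = [[random.gauss(0, 1) for _ in range(hidden_size)] for _ in range(seq_len)]
weights = [[random.gauss(0, 0.1) for _ in range(num_labels)] for _ in range(hidden_size)]
bias = [0.0] * num_labels

label_probs = token_classifier(last_layer, weights, bias)
# Pick the most likely label index for each token.
predicted = [max(range(num_labels), key=p.__getitem__) for p in label_probs]
```

During fine-tuning, the dense weights and the underlying encoder would be trained jointly with a cross-entropy loss over these per-token probabilities.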
@stefan-it One off-topic question, please: in "concatenation of the last four layers", does "last four layers" mean layers 9, 10, 11, and 12?
@mcggood Yes :) Btw: here are the results for the feature-based approach from the BERT paper:
Here is the XLNet extension project, which includes an XLNet-NER implementation: https://github.com/stevezheng23/xlnet_extension_tf
This XLNet extension project currently imports the zihangdai/xlnet repo as a submodule. Maybe we can consider merging it into the main repo?