pwldj closed this issue 5 years ago
XLNet-base usually underperforms XLNet-large across tasks. I haven't run experiments with the base model, but I think 91.3 F1 makes sense for the CoNLL2003 NER task.
Thanks, I think it's impossible to run the large version on my GPU. XD
OK, I tried everything I could with XLNet-base but still cannot reach 92.0 F1. It's hard to understand how BERT-base can get 92.4.
Maybe you can try one of the open-source BERT-NER reimplementations to reproduce 92.4? I actually implemented this XLNet-NER tool by referencing those BERT-NER reimplementations.
-- Best, Mingzhi
Moreover, you can also try turning fine-tuning of the core XLNet model on or off by modifying the source code.
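For reference, one way such an on/off switch could look is to filter variables by scope prefix before handing them to the optimizer. This is only a sketch, not the code in this repo: the `split_trainable` helper, the `model/transformer` prefix, and the example variable names are all assumptions for illustration.

```python
def split_trainable(var_names, frozen_prefix="model/transformer"):
    """Split variable names into (trainable, frozen) lists by scope prefix.

    frozen_prefix is an assumed scope for the core XLNet weights; check the
    actual variable names in your checkpoint before relying on it.
    """
    frozen = [n for n in var_names if n.startswith(frozen_prefix)]
    trainable = [n for n in var_names if not n.startswith(frozen_prefix)]
    return trainable, frozen


# Hypothetical variable names, for illustration only.
names = [
    "model/transformer/layer_0/rel_attn/q/kernel",
    "model/transformer/layer_0/ff/layer_1/kernel",
    "ner/project/kernel",
    "ner/project/bias",
]
trainable, frozen = split_trainable(names)
print("trainable:", trainable)
print("frozen:", frozen)
```

In TF 1.x the same filter can be applied to the variables returned by `tf.trainable_variables()` (using each variable's `.name`), with the kept ones passed as `var_list` to `optimizer.minimize`, so that only the NER head is updated while the core model stays fixed.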
Yes, reproducing results without knowing the author's implementation details is sometimes hard.
I tried running base-size XLNet with sequence length 128, batch size 32, and 2000 steps, but I can only get 91.3 F1 with the Perl version of conlleval. Is that right?
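For anyone comparing numbers: the F1 reported here is entity-level, meaning a predicted entity counts as correct only if both its span and its type match the gold annotation exactly. Below is a minimal sketch of that metric, assuming well-formed BIO tags; it is a simplification and not a port of the conlleval Perl script, and the function names are my own.

```python
def extract_entities(tags):
    """Return a set of (start, end, type) spans from a BIO tag sequence."""
    entities = set()
    start, etype = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        # An open span ends on O, on B-, or on an I- with a different type.
        if start is not None and (
            tag == "O" or tag.startswith("B-") or tag[2:] != etype
        ):
            entities.add((start, i - 1, etype))
            start, etype = None, None
        # A span starts on any non-O tag when no span is currently open.
        if tag != "O" and start is None:
            start, etype = i, tag[2:]
    return entities


def entity_f1(gold_tags, pred_tags):
    """Entity-level F1: exact match on both span boundaries and type."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = ["B-PER", "I-PER", "O", "B-LOC", "O"]
pred = ["B-PER", "I-PER", "O", "B-ORG", "O"]
print(entity_f1(gold, pred))  # 0.5: the PER span matches, the LOC/ORG one does not
```

When scoring a whole dataset with this sketch, put an `"O"` between sentences so that spans cannot run across sentence boundaries. Small differences in how malformed tag sequences are handled can shift the score slightly, so results should be compared using the same scorer.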