ShannonAI / mrc-for-flat-nested-ner

Code for ACL 2020 paper `A Unified MRC Framework for Named Entity Recognition`

A bug in mrc_utils.py #25

Open scarydemon2 opened 4 years ago

scarydemon2 commented 4 years ago

https://github.com/ShannonAI/mrc-for-flat-nested-ner/blob/0505c263a6a3868713e3abcd29856a931ba1a365/data_loader/mrc_utils.py#L145 Maybe `max_tokens_for_doc` on this line should be replaced by `max_seq_length`, because the "doc span pos" matrix is bounded by `max_seq_length`.

JaeZheng commented 4 years ago

I found this bug too and agree with you. I think the condition should be either `if len(query_tokens) + 2 + offset_idx_dict[int(s_idx)] <= max_seq_length and \` or `if offset_idx_dict[int(s_idx)] <= max_tokens_for_doc and \`.
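
To make the two proposed conditions concrete, here is a minimal sketch, not the repository's actual code. The helper names `fits_by_seq_length` and `fits_by_doc_budget` are hypothetical, the toy values are made up, and the relation `max_tokens_for_doc = max_seq_length - len(query_tokens) - 3` (reserving `[CLS]` and two `[SEP]` tokens) is an assumption about how the budget is computed.

```python
# Hypothetical helpers illustrating the two bounds checks discussed above.

def fits_by_seq_length(query_len, doc_offset, max_seq_length):
    # Position in the full input, assuming the layout [CLS] query [SEP] doc ...
    # so the document starts query_len + 2 tokens into the sequence.
    return query_len + 2 + doc_offset <= max_seq_length


def fits_by_doc_budget(doc_offset, max_tokens_for_doc):
    # Compare the document-relative offset against the document token budget.
    return doc_offset <= max_tokens_for_doc


query_tokens = ["which", "person", "?"]   # toy query
max_seq_length = 16                        # toy limit
# Assumption: three positions are reserved for [CLS], [SEP], [SEP].
max_tokens_for_doc = max_seq_length - len(query_tokens) - 3

# Toy mapping from original word index to sub-token offset inside the document.
offset_idx_dict = {0: 0, 5: 7, 9: 12}

by_seq = [
    fits_by_seq_length(len(query_tokens), off, max_seq_length)
    for off in offset_idx_dict.values()
]
by_doc = [
    fits_by_doc_budget(off, max_tokens_for_doc)
    for off in offset_idx_dict.values()
]

print(by_seq)
print(by_doc)
```

With these toy values both checks reject only the last offset. The point of the sketch is that each check compares like with like: a full-sequence position against `max_seq_length`, or a document-relative offset against `max_tokens_for_doc`. Mixing the two coordinate systems is what the issue describes.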

ghost commented 4 years ago

Apologies for the late reply.

Thanks for pointing out my mistake. Yes, this is a bug I introduced while cleaning up my codebase. I fixed it in commit f80ed26. Please pull the latest version of the repo.

Many Thanks!