syuoni / eznlp

Easy Natural Language Processing
Apache License 2.0

DocRED Joint extraction (sequence length problem, subtokens) #37

Closed SylvainVerdy closed 1 year ago

SylvainVerdy commented 1 year ago

Hi,

Thanks a lot for this amazing framework.

I'm working with the deep span representations model from ACL 2023. I have already adapted it to CoNLL 2004, and I'm now trying to adapt the model to the DocRED dataset, but I'm running into a sequence length problem.

The traceback is:

```
/miniconda3/envs/eznlp/lib/python3.8/site-packages/eznlp/model/bert_like.py", line 121, in _token_ids_from_tokenized
    assert len(sub_tokens) <= self.tokenizer.model_max_length - 2
AssertionError
```

Model used: distilroberta-base. Do you have an idea how to solve this problem?

Best regards,

Sylvain

syuoni commented 1 year ago

Hi Sylvain,

It seems that your input sequence is so long that its subword tokens exceed the model's maximum input length. As you may know, BERT/RoBERTa only accept subword sequences no longer than 512 tokens, two of which are reserved for the special tokens, hence the assertion against `model_max_length - 2`.
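
For instance, you can check which documents exceed the limit with a quick script like the one below. This is only a rough sketch that uses the Hugging Face tokenizer directly (the `words` list is a placeholder for one pre-tokenized document; it is not eznlp code):

```python
# Rough check (not part of eznlp): count the subword tokens of one
# pre-tokenized document and compare against the model's limit.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")

# Placeholder: one DocRED-style document as a list of words.
words = ["An", "example", "document", "with", "many", "words"]

# Tokenize word by word, since the words are already split.
sub_tokens = [st for w in words for st in tokenizer.tokenize(w)]

# Two positions are reserved for the special tokens <s> and </s>.
limit = tokenizer.model_max_length - 2
print(f"{len(sub_tokens)} sub-tokens (limit: {limit})")
```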

You may truncate the over-long input sequence, or segment it into shorter ones, so that the model can accept the input. A sketch of the second option follows.
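
A greedy segmentation along word boundaries could look like the sketch below. Note that this is only an illustration (the helper `segment_words` is an assumption, not part of eznlp), and for DocRED it is usually preferable to split at sentence boundaries:

```python
# Sketch: greedily split an over-long pre-tokenized document into chunks
# whose subword length stays within the encoder's limit. Not eznlp code.
from typing import List
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
MAX_SUBTOKENS = tokenizer.model_max_length - 2  # reserve <s> and </s>

def segment_words(words: List[str], max_subtokens: int = MAX_SUBTOKENS) -> List[List[str]]:
    """Split a word list into chunks that tokenize to at most max_subtokens subwords."""
    chunks, current, current_len = [], [], 0
    for word in words:
        n = len(tokenizer.tokenize(word))
        if current and current_len + n > max_subtokens:
            chunks.append(current)
            current, current_len = [], 0
        current.append(word)
        current_len += n
    if current:
        chunks.append(current)
    return chunks
```

Entity mentions and relations whose spans end up in different chunks would need extra handling (e.g. dropping or re-annotating them), which is another reason why sentence-level segmentation is often the safer choice.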

Best, Enwei