NielsRogge / Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.

How to handle encodings longer than max_length? #189

yellowjs0304 closed this issue 2 years ago

yellowjs0304 commented 2 years ago

@NielsRogge
Hi, I have a question. I'm fine-tuning LayoutXLM on my own data. I know the pre-trained model was trained with a maximum encoding length of 512, but what should I do if I need to run inference on long documents with this model? I heard the original repo (LayoutXLM) handles this by splitting the input into chunks: if the total length is 617, it is truncated into two inputs of 512 and 105 tokens.

Is there an option like this in Hugging Face, or in the LayoutXLM processor? I look forward to receiving any ideas. Thank you :)
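
For illustration, a minimal sketch of the splitting described above, in plain Python (the token ids here are a stand-in, not real model output):

```python
# Hypothetical example: cut a 617-token sequence into consecutive chunks
input_ids = list(range(617))  # stand-in for a real 617-token encoding
max_length = 512

chunks = [input_ids[i:i + max_length] for i in range(0, len(input_ids), max_length)]
print([len(c) for c in chunks])  # -> [512, 105]
```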

Navd15 commented 2 years ago

Hi @yellowjs0304, in the processor mixin there is an option called return_overflowing_tokens, which is False by default. When enabled, the processor returns more than one sequence per input, split according to the max_length you specify. See these links for more info:
https://huggingface.co/docs/transformers/internal/tokenization_utils
https://huggingface.co/course/chapter6/3b?fw=pt
Hope this helps.
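
Not an official recipe, but a minimal sketch of how that option can be used with the LayoutXLM processor; the checkpoint name, the image path, and the stride value below are assumptions:

```python
from PIL import Image
from transformers import LayoutXLMProcessor

# Assumed checkpoint and input file, for illustration only
processor = LayoutXLMProcessor.from_pretrained("microsoft/layoutxlm-base")
image = Image.open("document.png").convert("RGB")

encoding = processor(
    image,
    truncation=True,
    max_length=512,
    stride=128,                      # overlap between consecutive chunks
    padding="max_length",
    return_overflowing_tokens=True,  # split long sequences instead of dropping tokens
    return_offsets_mapping=True,     # the processor requires this when returning overflowing tokens
    return_tensors="pt",
)

# One row per 512-token chunk; overflow_to_sample_mapping maps each chunk
# back to the original sample it came from.
print(encoding["input_ids"].shape)
print(encoding["overflow_to_sample_mapping"])

# offset_mapping is only needed for post-processing, so drop it before the forward pass
encoding.pop("offset_mapping")
```

The stride makes consecutive chunks overlap, so a word that falls on a chunk boundary is fully visible in at least one chunk; at inference time you would run the model on each chunk and merge the predictions.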

yellowjs0304 commented 2 years ago

Thank you, I'll check it out.