taissirboukrouba / Structured-Information-Retrieval-with-LLMs

Academic Sequence Labelling Between DistilBERT & Encoder-only Transformer

Parser memory limit #2

Closed taissirboukrouba closed 3 months ago

taissirboukrouba commented 3 months ago

ValueError: [E088] Text of length 1020658 exceeds maximum of 1000000. The parser and NER models require roughly 1GB of temporary memory per 100,000 characters in the input. This means long texts may cause memory allocation errors. If you're not using the parser or NER, it's probably safe to increase the nlp.max_length limit. The limit is in number of characters, so you can check whether your inputs are too long by checking len(text).
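
The error message suggests checking len(text) to see which inputs are too long. A minimal sketch of that check in plain Python follows; the helper name `find_oversized`, the constant names, and the memory estimate (taken from the "roughly 1GB per 100,000 characters" figure in the message) are illustrative assumptions, not part of the repository or of spaCy.

```python
# Limit reported in the error message above (1,000,000 characters).
REPORTED_MAX_LENGTH = 1_000_000

# Rough figure quoted by the error message: ~1 GB per 100,000 characters
# when the parser or NER components run.
CHARS_PER_GB = 100_000


def find_oversized(texts, limit=REPORTED_MAX_LENGTH):
    """Return (index, length, estimated_gb) for every text longer than `limit`.

    Hypothetical helper: it only inspects len(text) and never calls spaCy.
    """
    oversized = []
    for index, text in enumerate(texts):
        length = len(text)
        if length > limit:
            oversized.append((index, length, length / CHARS_PER_GB))
    return oversized
```

By that estimate, the 1,020,658-character text from the traceback would need on the order of 10 GB of temporary memory if the parser or NER were enabled.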

taissirboukrouba commented 3 months ago

Solved: raised the limit inside the lemmatize_text() function by setting nlp.max_length = len(text) + 100, so the limit always sits 100 characters above the length of the input text.
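
A minimal sketch of how that fix might look, assuming lemmatize_text() is handed a loaded spaCy pipeline as `nlp`. The function signature, the `margin` parameter, and the choice to raise the limit only when it is actually too low are assumptions for illustration; the repository's real function may differ.

```python
def lemmatize_text(nlp, text, margin=100):
    """Lemmatize `text`, raising nlp.max_length first if the text is too long.

    `nlp` is assumed to be a loaded spaCy pipeline (for example the object
    returned by spacy.load). The signature and `margin` are hypothetical.
    """
    needed = len(text) + margin
    # Only raise the limit; never shrink it below its current value.
    if needed > nlp.max_length:
        nlp.max_length = needed
    doc = nlp(text)
    # Join the lemma of each token back into a single string.
    return " ".join(token.lemma_ for token in doc)
```

Because the error message warns that the parser and NER need roughly 1GB of temporary memory per 100,000 characters, raising the limit is safest when those components are not running, for instance by loading the pipeline with spacy.load(model_name, disable=["parser", "ner"]), where model_name stands for whichever model the project uses.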