When I passed a long text to a large model, specifically a prompt whose length was 1024*1024 characters, the tokenizer panicked with a StackOverflow error:
thread '<unnamed>' panicked at src/lib.rs:227:33:
called `Result::unwrap()` on an `Err` value: RuntimeError(StackOverflow)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "/home/yang/teststack_1m.py", line 37, in <module>
    encoded_tokens = your_object.encode_request(request_id=request_id, prompt=prompt)
  File "/home/yang/teststack_1m.py", line 19, in encode_request
    prompt_token_ids = self.tokenizer.encode(prompt, add_special_tokens=True)
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2600, in encode
    encoded_inputs = self.encode_plus(
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3008, in encode_plus
    return self._encode_plus(
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 719, in _encode_plus
    first_ids = get_input_ids(text)
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 686, in get_input_ids
    tokens = self.tokenize(text, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 617, in tokenize
    tokenized_text.extend(self._tokenize(token))
  File "/root/.cache/huggingface/modules/transformers_modules/tokenization_chatglm.py", line 88, in _tokenize
    ids = self.tokenizer.encode(text)
  File "/usr/local/lib/python3.10/site-packages/tiktoken/core.py", line 124, in encode
    return self._core_bpe.encode(text, allowed_special)
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: RuntimeError(StackOverflow)
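For anyone trying to reproduce this, here is a minimal sketch that calls tiktoken directly, bypassing the transformers/ChatGLM wrapper shown in the traceback. The `cl100k_base` encoding and the repeated-character payload are my assumptions (the original report went through the BPE model bundled with tokenization_chatglm.py), and whether the panic actually triggers may depend on the tiktoken version and the exact contents of the text.

```python
# Minimal repro sketch: encode a ~1 MiB string directly with tiktoken.
# Assumptions: "cl100k_base" stands in for the ChatGLM BPE model, and a
# single repeated character stands in for the real 1024*1024 prompt.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # assumed encoding
text = "a" * (1024 * 1024)  # 1 MiB payload, matching the reported length

# On affected versions this is the call that panics in the Rust core
# with RuntimeError(StackOverflow); otherwise it returns the token ids.
tokens = enc.encode(text)
print(len(tokens))
```

As a possible stopgap, splitting the input into smaller chunks and encoding each chunk separately may avoid the panic, though token boundaries at the chunk edges can then differ from encoding the full text in one call.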