Under high concurrency, the fast (Rust) tokenizer raises "Already borrowed" exceptions when the truncation, padding, or max length params are changed. Do not change those params. Instead, tokenize with the default settings and use some extra code to detect where to truncate (or whether to return an error message). Truncate the texts and re-tokenize them when needed.
In addition, the error message, which used to report the number of tokens over the limit (but not which sentence was affected), now returns the index(es) of the sentence(s) found to be too long.
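
A minimal sketch of the approach, not the actual code in this change: tokenize with fixed settings, use the token character offsets to find the cut point, and report the indexes of over-long inputs. The function names (`find_too_long`, `truncate_texts`, `check_or_truncate`) and the whitespace tokenizer are hypothetical stand-ins. With a real fast tokenizer the offsets would presumably come from a call with `return_offsets_mapping=True`, and special tokens, which this sketch ignores, would have to be counted against the limit.

```python
import re
from typing import Callable, List, Sequence, Tuple

Offsets = List[Tuple[int, int]]


def whitespace_offsets(text: str) -> Offsets:
    """Stand-in tokenizer (assumption): one token per whitespace-separated
    word, returned as (start, end) character spans."""
    return [m.span() for m in re.finditer(r"\S+", text)]


def find_too_long(
    texts: Sequence[str],
    max_tokens: int,
    tokenize: Callable[[str], Offsets] = whitespace_offsets,
) -> List[int]:
    """Return the indexes of the texts whose token count exceeds max_tokens."""
    return [i for i, t in enumerate(texts) if len(tokenize(t)) > max_tokens]


def truncate_texts(
    texts: Sequence[str],
    max_tokens: int,
    tokenize: Callable[[str], Offsets] = whitespace_offsets,
) -> List[str]:
    """Cut each over-long text at the end of the last token that fits.
    The tokenizer settings are never modified; the caller re-tokenizes
    the shortened text afterwards."""
    result = []
    for text in texts:
        offsets = tokenize(text)
        if len(offsets) > max_tokens:
            cut = offsets[max_tokens - 1][1] if max_tokens > 0 else 0
            text = text[:cut]
        result.append(text)
    return result


def check_or_truncate(
    texts: Sequence[str],
    max_tokens: int,
    auto_truncate: bool,
    tokenize: Callable[[str], Offsets] = whitespace_offsets,
) -> List[str]:
    """Truncate when allowed; otherwise raise an error naming the indexes
    of the inputs that are too long."""
    too_long = find_too_long(texts, max_tokens, tokenize)
    if not too_long:
        return list(texts)
    if not auto_truncate:
        raise ValueError(
            f"Input(s) at index(es) {too_long} exceed the limit of "
            f"{max_tokens} tokens"
        )
    return truncate_texts(texts, max_tokens, tokenize)


if __name__ == "__main__":
    batch = ["a b c", "one two three four five", "x"]
    print(find_too_long(batch, 3))
    print(check_or_truncate(batch, 3, auto_truncate=True))
```

Because the length check and the cut both happen outside the tokenizer, the shared tokenizer object is only ever read, which is the property that avoids reconfiguring it per request.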