Closed · saswat0 closed this issue 9 months ago
You should probably pack your data; that will also be more efficient, I think. Separate the sequences with the EOS token.
@ArthurZucker I'm not sure if I got your point entirely.
For now, I'm flattening the whole batch of tensors before passing it into the model and then reshaping the output afterwards.
Do you mean that I should concatenate the sentences together with an EOS token in between?
What I mean is that you cannot pass a list of lists of lists of strings to a tokenizer. What you need to do is separate the sentences with a special token, like <SEP> for example.
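A minimal sketch of the suggested packing, assuming a standard Hugging Face tokenizer (the checkpoint name and nested structure below are illustrative, not taken from this thread):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # illustrative checkpoint

# Hypothetical nested input: each inner list is a group of sentences
nested = [
    ["first sentence", "second sentence"],
    ["third sentence", "fourth sentence"],
]

# Pack each group into one string, joined by the tokenizer's separator token
# (falling back to EOS for models that define that instead of SEP)
sep = tokenizer.sep_token or tokenizer.eos_token
packed = [f" {sep} ".join(group) for group in nested]

# The input is now a plain List[str], which encodes in a single call
encodings = tokenizer(packed, padding=True, return_tensors="pt")
```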
I am not sure I understand your last comment, as the tokenizer does not take tensors as input. Models usually support batches of inputs, so there should be no need to flatten, but if it works for you, alright.
Feature request
Can inference be performed with larger batch dimensions? Currently, tokenisation is supported up to List[List[str]] inputs, and any higher-dimensional input has to be traversed with a for loop. Can we lift this requirement and process the entire batch in one go?
Motivation
Tokenisation of a batch of batches of strings currently has to be done sequentially, iterating over each inner batch. On a GPU, it should be possible to do this in one go, thereby reducing latency and improving utilisation by a large margin.
Currently, the following snippet works fine:
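(The original snippet was not preserved in this thread; the sketch below is an illustrative reconstruction, assuming a standard Hugging Face tokenizer and treating each inner list as a text pair.)

```python
from transformers import AutoTokenizer

# Illustrative checkpoint; any model with a tokenizer behaves the same way
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# A batch of sentence pairs (List[List[str]]) encodes in one call
pairs = [["query 1", "passage a"], ["query 2", "passage b"]]
encodings = tokenizer(pairs, padding=True, return_tensors="pt")
print(encodings["input_ids"].shape)  # (batch_size, seq_len)
```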
But encoding a batch of batches of strings for comparison fails:
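(Again an illustrative reconstruction, reusing the tokenizer from the snippet above:)

```python
# Three levels of nesting (List[List[List[str]]]): a batch of groups,
# each group itself a batch of sentence pairs
nested_batch = [
    [["query 1", "passage a"], ["query 1", "passage b"]],
    [["query 2", "passage a"], ["query 2", "passage b"]],
]
# Raises a ValueError: tokenizer input validation only accepts
# nesting up to List[List[str]]
encodings = tokenizer(nested_batch, padding=True, return_tensors="pt")
```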
The following error is raised (irrespective of the tokenizer):
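(The original traceback was not preserved here; the message below is the input-validation error that transformers raises for unsupported nesting, reproduced from memory and possibly differing in exact wording from the reporter's version.)

```
ValueError: text input must of type `str` (single example), `List[str]` (batch or single pretokenized example) or `List[List[str]]` (batch of pretokenized examples).
```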
Is there any way to perform inference at once for a List[List[List[str]]] without iterating one by one?
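For reference, the sequential workaround currently required looks something like this (variable names illustrative, reusing `nested_batch` from above):

```python
# Iterate over the outermost dimension and encode each List[List[str]]
# separately -- the per-item loop this request wants to avoid
encodings = [
    tokenizer(inner, padding=True, return_tensors="pt")
    for inner in nested_batch
]
```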
Your contribution
No contribution. Just raising a request.