Open jondot opened 2 weeks ago
Yes, this is because the `Tokenizer` used by the pipeline needs to be `mut` in order to be able to create it with the padding configuration. It's not ideal, and I'll try to change this so that the pipeline doesn't need to be mutable.
Hi again! From what I noticed, pipelines are `mut` when `run` (`mut self`), which makes me wonder how to keep one in a server and reuse it (just call `run` on every new request). Digging in, I see this is mainly because the tokenizer is `mut`, along with some other variables, but no actual state is being saved, so there is no worry of one inference influencing the result of the next. So it's enough to put it under a mutex, and then the instance itself is shareable. Is that correct so far?
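For illustration, here is a minimal sketch of the mutex approach described above. It does not use the library's real types: `Pipeline`, its `run(&mut self, ...)` signature, and `handle_requests` are hypothetical stand-ins, and the "inference" is a placeholder that just returns word lengths. The only assumption carried over from the discussion is that `run` needs mutable access but keeps no state between calls.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a real pipeline: `run` takes `&mut self`,
/// but nothing it writes is carried over to the next call.
struct Pipeline {
    scratch: Vec<usize>, // reusable buffer, cleared at the start of every call
}

impl Pipeline {
    fn new() -> Self {
        Pipeline { scratch: Vec::new() }
    }

    /// Placeholder "inference": the length of each whitespace-separated word.
    fn run(&mut self, input: &str) -> Vec<usize> {
        self.scratch.clear();
        self.scratch.extend(input.split_whitespace().map(|w| w.len()));
        self.scratch.clone()
    }
}

/// Handles each request on its own thread, sharing one pipeline behind a mutex.
/// Results are returned in the same order as the requests.
fn handle_requests(pipeline: Arc<Mutex<Pipeline>>, requests: Vec<String>) -> Vec<Vec<usize>> {
    let handles: Vec<_> = requests
        .into_iter()
        .map(|req| {
            let pipeline = Arc::clone(&pipeline);
            thread::spawn(move || {
                // Only one thread at a time gets mutable access to the pipeline.
                let mut guard = pipeline.lock().expect("pipeline mutex poisoned");
                guard.run(&req)
            })
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

In a server, the `Arc<Mutex<Pipeline>>` would be created once at startup and cloned into each request handler. Note that the mutex serializes calls to `run`, so requests are processed one at a time; if that becomes a bottleneck, one option is to keep a small pool of pipelines instead of a single shared instance.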