triton-inference-server / client

Triton Python, C++ and Java client libraries, and gRPC-generated client examples for Go, Java and Scala.
BSD 3-Clause "New" or "Revised" License

Fix input token calculation #625

Closed: tgerdesnv closed this 2 months ago

tgerdesnv commented 2 months ago

Before this fix, running the OpenAI completions endpoint always reported 1 for input tokens. After a partial fix, the count was always off by 1. With both fixes combined in this PR, the reported input token count is now correct.