AI-Hypercomputer / JetStream
JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (GPUs in the future; PRs welcome).
Align Tokenizer in JetStream #40
Closed · JoeZijunZhou closed 5 months ago
JoeZijunZhou commented 5 months ago
- Return token ids instead of a custom decode-operation result from JetStream; let the client perform tokenizer decoding
- Update the benchmark script and requester tool to decode with JetStream's seqio tokenizer library
- Use the correct method to decode the whole output token id list in one call
- Update unit tests for the changes above
- Enforce Python type checking in the benchmark script
- Update the README unit test section
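Decoding the whole output token id list in one call matters because a single character can span a token boundary; decoding token-by-token then corrupts it. A minimal sketch of the failure mode, using hypothetical byte-level tokens rather than JetStream's actual seqio tokenizer API:

```python
# Hypothetical byte-level tokens for the string "café": the two-byte UTF-8
# sequence for "é" (0xC3 0xA9) is split across two tokens.
token_bytes = [b"caf", b"\xc3", b"\xa9"]

# Wrong: decoding each token separately replaces each partial UTF-8
# sequence with U+FFFD instead of combining the bytes.
per_token = "".join(t.decode("utf-8", errors="replace") for t in token_bytes)

# Right: accumulate the full output first, then decode once.
whole = b"".join(token_bytes).decode("utf-8")

print(per_token)  # "caf" followed by two replacement characters
print(whole)      # "café"
```

The same reasoning applies to subword vocabularies: returning raw token ids and letting the client decode the accumulated list avoids committing to a lossy per-token decode on the server side.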