google / jetstream-pytorch

PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Apache License 2.0

Leverage tokens_utils to process result tokens #90

Closed FanhaiLu1 closed 3 months ago

FanhaiLu1 commented 3 months ago

Leverage token_utils to process the result tokens. This simplifies the logic of run_interactive.py and ensures the results are consistent between run_interactive and the online serving path. A hedged sketch of the resulting decode loop is below.
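
For context, this is a minimal sketch of what a run_interactive-style decode loop looks like once per-token result handling is delegated to JetStream's token_utils instead of being reimplemented locally. The helper name process_result_tokens, its keyword arguments, and the return shape are assumptions based on this PR's description; the authoritative definitions live in jetstream/engine/token_utils.py.

```python
# Hedged sketch only: helper names, arguments, and return shapes below are
# assumptions drawn from the PR description, not the exact JetStream API.
import numpy as np
from jetstream.engine import token_utils


def decode_loop(engine, params, decode_state, vocab, max_output_length=1024):
  """Generates tokens for slot 0 and returns the decoded text."""
  complete = np.zeros((1,), dtype=np.bool_)
  output_ids = []
  for _ in range(max_output_length):
    decode_state, result_tokens = engine.generate(params, decode_state)
    # Delegate validity-mask and stop-token handling to token_utils, the same
    # code path the online orchestrator uses, so interactive and online
    # results stay consistent.
    results, complete = token_utils.process_result_tokens(  # assumed helper
        slot=0,
        slot_max_length=max_output_length,
        result_tokens=result_tokens,
        vocab=vocab,
        complete=complete,
    )
    output_ids.extend(results[0])  # assumed: per-slot list of sampled token ids
    if complete.all():
      break
  return vocab.tokenizer.decode(output_ids)  # assumed SentencePiece-style vocab
```

The design point is that run_interactive.py no longer needs its own copy of the result-token bookkeeping; any change to stop-token or masking behavior in token_utils is picked up by both paths automatically.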