A high-performance inference system for large language models, designed for production environments.
316 stars, 23 forks
feat: added id_to_token for tokenizer to handle unfinished byte sequence, ending with "�" #238
Closed
guocuimi closed 3 weeks ago
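
The title refers to a common streaming-detokenization problem: a token can end in the middle of a multi-byte UTF-8 character, so decoding the bytes seen so far yields text ending in the replacement character "�" (U+FFFD). The following is a minimal Python sketch of one way to handle this, not the code from this PR. The vocabulary `VOCAB`, the `id_to_token` signature, and `stream_decode` are hypothetical and exist only for illustration.

```python
from typing import Dict, List

REPLACEMENT = "\ufffd"  # U+FFFD, rendered as "�"

# Hypothetical byte-level vocabulary (assumption, for illustration only).
VOCAB: Dict[int, bytes] = {
    0: b"Hi ",
    1: b"\xe4\xbd",  # first two bytes of "你" (UTF-8: E4 BD A0)
    2: b"\xa0",      # final byte of "你"
    3: b"!",
}


def id_to_token(token_id: int) -> bytes:
    """Return the raw bytes of a single token (hypothetical helper)."""
    return VOCAB[token_id]


def stream_decode(token_ids: List[int]) -> List[str]:
    """Emit text chunks, holding back bytes that do not yet form valid UTF-8."""
    pending = b""
    chunks: List[str] = []
    for token_id in token_ids:
        pending += id_to_token(token_id)
        text = pending.decode("utf-8", errors="replace")
        if text.endswith(REPLACEMENT):
            # Possibly an unfinished byte sequence: wait for more tokens.
            continue
        chunks.append(text)
        pending = b""
    return chunks


chunks = stream_decode([0, 1, 2, 3])
```

In this sketch the two tokens that together encode "你" are emitted as one chunk, so the caller never sees a stray "�". A genuinely invalid trailing byte would be held back indefinitely here; a production implementation would need a bound on how long output is withheld.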