Closed CharlieFRuan closed 4 days ago
Change
- `web-xgrammar`: `ResponseFormat.type == "grammar"`, where you specify an EBNF grammar string
- Add `grammar_init_ms` and `grammar_per_token_ms` to `CompletionUsage.extra` when using grammar
- Add `time_to_first_token_s` (TTFT), `time_per_output_token_s` (TPOT), and `e2e_latency_s` to `CompletionUsage.extra`
- Add `ignore_eos` to `Completion` and `ChatCompletion` requests

TVMjs
- `0.18.0-dev2`, just like 0.2.71
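
As a rough illustration of the request and usage fields listed under Change, here is a minimal TypeScript sketch. It does not import the library: the `*Sketch` interfaces, `buildGrammarRequest`, and `summarizeTimings` are hypothetical names declared locally, and the `grammar` field used to carry the EBNF string is an assumption, since the notes above only say that an EBNF grammar string is specified. The units are inferred from the `_s` and `_ms` suffixes.

```typescript
// Hypothetical local types mirroring names from the changelog above.
// They are NOT the library's definitions; the real shapes may differ.
interface ResponseFormatSketch {
  type: "text" | "json_object" | "grammar";
  grammar?: string; // assumed field name for the EBNF grammar string
}

interface ChatRequestSketch {
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  response_format?: ResponseFormatSketch;
  ignore_eos?: boolean;
}

// Field names taken from the changelog; every field is treated as optional here.
interface UsageExtraSketch {
  time_to_first_token_s?: number;
  time_per_output_token_s?: number;
  e2e_latency_s?: number;
  grammar_init_ms?: number;
  grammar_per_token_ms?: number;
}

// Build a request that asks for grammar-constrained output.
function buildGrammarRequest(
  prompt: string,
  ebnf: string,
  ignoreEos = false,
): ChatRequestSketch {
  return {
    messages: [{ role: "user", content: prompt }],
    response_format: { type: "grammar", grammar: ebnf },
    ignore_eos: ignoreEos,
  };
}

// Turn whichever timing fields are present into printable lines.
function summarizeTimings(extra: UsageExtraSketch): string[] {
  const lines: string[] = [];
  if (extra.time_to_first_token_s !== undefined) {
    lines.push(`TTFT: ${extra.time_to_first_token_s} s`);
  }
  if (extra.time_per_output_token_s !== undefined) {
    lines.push(`TPOT: ${extra.time_per_output_token_s} s`);
  }
  if (extra.e2e_latency_s !== undefined) {
    lines.push(`e2e latency: ${extra.e2e_latency_s} s`);
  }
  if (extra.grammar_init_ms !== undefined) {
    lines.push(`grammar init: ${extra.grammar_init_ms} ms`);
  }
  if (extra.grammar_per_token_ms !== undefined) {
    lines.push(`grammar per token: ${extra.grammar_per_token_ms} ms`);
  }
  return lines;
}

// Illustrative grammar string only; check the grammar syntax the library accepts.
const yesNoGrammar = 'root ::= "yes" | "no"';
const request = buildGrammarRequest("Is WebGPU available?", yesNoGrammar);
```

In real use, an object shaped like `request` would be passed to the engine's chat-completion call, and `summarizeTimings` would be given the `extra` object from the returned usage; both steps are left out here because they depend on the actual library API.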