sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.
https://sglang.readthedocs.io/en/latest/
Apache License 2.0
5.11k stars 356 forks

[Feature] RadixCache: remove recursive logic #813

Open hnyls2002 opened 1 month ago

hnyls2002 commented 1 month ago

Motivation

When the chunked prefill size is small and the prefill is long, the recursive logic in the radix cache exceeds Python's maximum recursion depth.

Same as proposed in #154
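
To make the request concrete, here is a minimal sketch of how prefix matching and insertion on a radix tree can be written with loops instead of recursion. It is not SGLang's actual `RadixCache` code: `Node`, `insert`, `match_prefix`, and `_common_len` are hypothetical names chosen for illustration, and the sketch assumes that each small prefill chunk can add one more level to the tree, which is how a long prefill would produce a chain deeper than the recursion limit.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical node type: `key` is the token span on the edge into this node.
    key: tuple = ()
    children: dict = field(default_factory=dict)  # first token of child key -> Node


def _common_len(key, tokens, start):
    """Length of the common prefix of `key` and `tokens[start:]`."""
    n = 0
    limit = min(len(key), len(tokens) - start)
    while n < limit and key[n] == tokens[start + n]:
        n += 1
    return n


def match_prefix(root, tokens):
    """Return how many leading tokens are already stored, using a loop."""
    tokens = tuple(tokens)
    node, i = root, 0
    while i < len(tokens):
        child = node.children.get(tokens[i])
        if child is None:
            break
        n = _common_len(child.key, tokens, i)
        i += n
        if n < len(child.key):
            break  # diverged (or ran out of tokens) inside this edge
        node = child
    return i


def insert(root, tokens):
    """Insert `tokens` iteratively; return how many tokens were already present."""
    tokens = tuple(tokens)
    node, i = root, 0
    while i < len(tokens):
        child = node.children.get(tokens[i])
        if child is None:
            # No edge starts with this token: attach the remainder as a new leaf.
            node.children[tokens[i]] = Node(key=tokens[i:])
            return i
        n = _common_len(child.key, tokens, i)
        if n < len(child.key):
            # Partial match: split the edge so the shared part gets its own node.
            mid = Node(key=child.key[:n], children={child.key[n]: child})
            child.key = child.key[n:]
            node.children[tokens[i]] = mid
            child = mid
        i += n
        node = child
    return i
```

Because the walk keeps only the current node and an index, its stack use stays constant however deep the tree gets. Inserting the prefixes of a sequence one token at a time, for example, builds a chain with one node per token, and both functions still traverse it without touching the recursion limit.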

jon-chuang commented 1 month ago

Hello @hnyls2002, please assign this to me.

jon-chuang commented 1 month ago

I'm just wondering: why did you choose to implement the radix trie in Python rather than use an off-the-shelf library?