sgl-project / sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
Apache License 2.0
2.75k stars 177 forks

Why use a 16-bit dtype for the memory pool state? #570

Open yileld opened 3 days ago

yileld commented 3 days ago

In python/sglang/srt/memory_pool.py, TokenToKVPool defines

self.mem_state = torch.zeros((size,), dtype=torch.int16, device="cuda")

but ReqToTokenPool defines

self.mem_state = torch.ones((size,), dtype=torch.bool, device="cuda")

The two dtypes are different. Can I change both to torch.bool?
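
To make the question concrete, here is a minimal sketch of what each dtype can represent. It assumes (this is not confirmed by the lines quoted above) that the int16 state might be used as a small per-slot counter, while the bool state is only a free/taken flag. The variable names and the pool size are hypothetical, and the tensors live on the CPU so the sketch runs without a GPU.

```python
import torch

size = 8  # hypothetical pool size, for illustration only

# int16 state: each slot can hold a small count (0 means unused).
# Assumption: the pool counts users per slot; the issue does not confirm this.
counter_state = torch.zeros((size,), dtype=torch.int16)
counter_state[[1, 2]] += 1  # first user of slots 1 and 2
counter_state[[2]] += 1     # a second user shares slot 2
counter_state[[2]] -= 1     # one user releases slot 2; it is still in use

# bool state: each slot is either free (True) or taken (False), nothing more.
flag_state = torch.ones((size,), dtype=torch.bool)
flag_state[[1, 2]] = False

# Both layouts can answer "which slots are free?"
free_by_counter = torch.nonzero(counter_state == 0).squeeze(1)
free_by_flag = torch.nonzero(flag_state).squeeze(1)

# Storage cost per element for each dtype.
bytes_int16 = counter_state.element_size()
bytes_bool = flag_state.element_size()
```

If the state is only ever tested for free versus taken, the two layouts carry the same information and bool uses one byte per slot instead of two. If any code increments or decrements the state, switching to torch.bool would lose the count, so it is worth checking how mem_state is updated before changing the dtype.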