vectorch-ai / ScaleLLM
A high-performance inference system for large language models, designed for production environments.
https://docs.vectorch.com/
Apache License 2.0 · 377 stars · 28 forks
[minor] Use available memory to calculate cache_size by default.
#245
Closed
liutongxuan
closed
3 months ago
guocuimi
commented
3 months ago
Thanks, looks good to me. Two more changes are needed:
https://github.com/vectorch-ai/ScaleLLM/blob/9a9a7b5d4f6afd1910463e10ee6d23b64a66d783/scalellm/serve/server_args.py#L54
https://github.com/vectorch-ai/ScaleLLM/blob/9a9a7b5d4f6afd1910463e10ee6d23b64a66d783/src/handlers/llm_handler.h#L69
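For context, the change discussed here derives the default KV-cache size from the memory actually available on the device rather than a fixed value. A minimal sketch of that idea is below; the function name `default_cache_size` and the `utilization` parameter are hypothetical illustrations, not ScaleLLM's actual API:

```python
def default_cache_size(free_memory_bytes: int, utilization: float = 0.9) -> int:
    """Derive a default KV-cache size from currently free device memory.

    Hypothetical helper: reserves a fraction of the reported free memory
    for the cache, leaving headroom for activations and fragmentation.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return int(free_memory_bytes * utilization)


# Example: 24 GiB free on the device, keep 90% for the KV cache.
cache_bytes = default_cache_size(24 * 1024**3)
print(cache_bytes)
```

In practice the free-memory figure would come from a device query (e.g. CUDA's memory-info API) at startup, so the default adapts to whatever hardware the server is deployed on.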