Closed: lvhan028 closed this issue 5 months ago.
Hi @cmaureir, is there any update?
Can you add more information about the project? You only added the error message in that section.
Sure @cmaureir
LMDeploy is a toolkit for compressing, deploying, and serving LLMs. It supports the Linux-x86_64, Linux-aarch64, and Windows platforms. The project is developed in C++, CUDA, and Python. It has the following core features:
Efficient Inference: LMDeploy delivers up to 1.8x higher request throughput than vLLM by introducing key features like persistent batch (a.k.a. continuous batching), blocked KV cache, dynamic split&fuse, tensor parallelism, high-performance CUDA kernels, and so on.
Effective Quantization: LMDeploy supports weight-only and k/v quantization, and its 4-bit inference performance is 2.4x higher than FP16. The quantization quality has been confirmed via OpenCompass evaluation.
Effortless Distribution Server: Leveraging the request distribution service, LMDeploy makes it easy and efficient to deploy multi-model services across multiple machines and GPU cards.
Interactive Inference Mode: By caching the k/v of attention during multi-round dialogue, the engine remembers the dialogue history and avoids reprocessing historical sessions.
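To illustrate the blocked KV cache idea mentioned above: instead of reserving one contiguous buffer per sequence, the cache hands out fixed-size blocks on demand, so memory use scales with the tokens actually generated. This is a minimal conceptual sketch, not LMDeploy's actual implementation; the class, block size, and method names are all illustrative assumptions.

```python
class BlockedKVCache:
    """Toy block allocator sketching the blocked-KV-cache idea
    (illustrative only; not LMDeploy's real data structure)."""

    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))  # pool of free block ids
        self.blocks = {}   # seq_id -> list of block ids owned by the sequence
        self.lengths = {}  # seq_id -> number of tokens cached so far

    def append_token(self, seq_id):
        n = self.lengths.get(seq_id, 0)
        if n % self.block_size == 0:  # current blocks are full: grab a new one
            if not self.free_blocks:
                raise MemoryError("KV cache exhausted")
            self.blocks.setdefault(seq_id, []).append(self.free_blocks.pop())
        self.lengths[seq_id] = n + 1

    def release(self, seq_id):
        # A finished sequence returns all of its blocks to the free pool,
        # so other requests in the batch can reuse them immediately.
        self.free_blocks.extend(self.blocks.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

cache = BlockedKVCache(num_blocks=8, block_size=4)
for _ in range(10):
    cache.append_token(seq_id=0)
print(len(cache.blocks[0]))  # 10 tokens at 4 tokens/block -> 3 blocks
```

Because blocks are reclaimed as soon as a sequence finishes, many concurrent dialogues can share a fixed memory budget, which is what enables the persistent/continuous batching behavior described above.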
@cmaureir I would appreciate it if this request could be accepted. Is there anything else I can do? Feel free to ask.
This issue used the wrong template.
Project URL
https://pypi.org/project/lmdeploy/
Does this project already exist?
New Limit
30000
Update issue title
Which indexes
PyPI
About the project
ERROR HTTPError: 400 Bad Request from https://upload.pypi.org/legacy/ Project size too large. Limit for project 'lmdeploy' total size is 10 GB. See https://pypi.org/help/#project-size-limit
Reasons for the request
30GB
Code of Conduct