neuralmagic / nm-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://nm-vllm.readthedocs.io
Other
251 stars 10 forks

Update README about archival #406

Closed andy-neuma closed 2 months ago

andy-neuma commented 2 months ago

SUMMARY: