vllm-project / llm-compressor

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
Apache License 2.0
644 stars 52 forks

fp8 config for original, default, default kv #144

Open horheynm opened 1 month ago

horheynm commented 1 month ago

SUMMARY: "please provide a brief summary"

TEST PLAN: "please outline how the changes were tested"

github-actions[bot] commented 1 month ago

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

markurtz commented 2 weeks ago

@horheynm is this still in progress or should we close this out?