mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
MIT License

Request for Semi-Structured Sparse Matrix Support in AWQ Kernel #199

Open pprp opened 5 months ago

pprp commented 5 months ago

Description

I would like to request support for semi-structured sparse matrices in the AWQ kernel. I believe this would significantly broaden the kernel's applicability across machine learning tasks.

Motivation

Semi-structured sparse matrices, whose nonzero entries follow a partially or fully regular pattern, are common in fields such as natural language processing. Supporting them in the AWQ kernel would reduce memory usage and could speed up computation, both of which are crucial for large-scale applications.
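To make the request concrete, here is a minimal sketch of the layout I have in mind. It assumes "semi-structured" means the 2:4 pattern (at most two nonzero values in every group of four consecutive weights), which is the pattern PyTorch's semi-structured sparsity support targets. This is plain Python for illustration only, not AWQ kernel code, and the function names `compress_2_4` and `decompress_2_4` are hypothetical.

```python
from typing import List, Tuple


def compress_2_4(row: List[float]) -> Tuple[List[float], List[int]]:
    """Keep the two largest-magnitude values in every group of four.

    Returns the kept values and their positions inside each group
    (0-3, so each position fits in 2 bits of metadata).
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    values: List[float] = []
    indices: List[int] = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Rank in-group positions by magnitude, keep the top two,
        # then restore positional order so decompression is simple.
        by_magnitude = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)
        for i in sorted(by_magnitude[:2]):
            values.append(group[i])
            indices.append(i)
    return values, indices


def decompress_2_4(values: List[float], indices: List[int]) -> List[float]:
    """Rebuild the dense (pruned) row from the compressed form."""
    dense = [0.0] * (len(values) * 2)  # two kept values per group of four
    for k, (value, index) in enumerate(zip(values, indices)):
        group_start = (k // 2) * 4
        dense[group_start + index] = value
    return dense


row = [0.1, -0.9, 0.5, 0.05, 2.0, 0.0, -0.3, 1.0]
values, indices = compress_2_4(row)
pruned = decompress_2_4(values, indices)
```

The compressed form stores half as many values as the dense row plus a small index per kept value, which is where the memory saving would come from. How this metadata would be combined with AWQ's packed low-bit weights is an open question and is not addressed by this sketch.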

Tutorial for the official implementation of semi_structured_sparse: link

Thank you for considering this feature request.