microsoft / TransformerCompression

For releasing code related to compression methods for transformers, accompanying our publications
MIT License
354 stars, 31 forks

Add abstraction for QuarotFP16Linear layers in RTN quantization #152

Closed: pashminacameron closed this 2 months ago

pashminacameron commented 3 months ago

Added an abstraction for module names. Needs testing.
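
For readers unfamiliar with the idea, here is a minimal sketch of what abstracting over module names in a round-to-nearest (RTN) pass could look like. It is not the code from this PR. Every name in it (`Linear`, `StandInQuarotLinear`, `rtn_quantize`, `quantize_model`) is a hypothetical stand-in, layers are plain Python objects rather than real modules, and selecting layers by type is only one assumed way to avoid hardcoding names.

```python
from dataclasses import dataclass


@dataclass
class Linear:
    """Hypothetical stand-in for a linear layer: just a flat list of weights."""
    weight: list


class StandInQuarotLinear(Linear):
    """Hypothetical stand-in for a QuaRot-style FP16 linear layer."""


def rtn_quantize(weights, bits=4):
    """Symmetric round-to-nearest: scale, round, clamp, then rescale."""
    max_int = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    scale = max_abs / max_int
    return [
        max(-max_int - 1, min(max_int, round(w / scale))) * scale
        for w in weights
    ]


def quantize_model(modules, target_type=StandInQuarotLinear, bits=4):
    """Quantize every module of target_type and return the names touched.

    Layers are selected by type (an assumption of this sketch), so the
    caller does not need to list layer names explicitly.
    """
    touched = []
    for name, module in modules.items():
        if isinstance(module, target_type):
            module.weight = rtn_quantize(module.weight, bits)
            touched.append(name)
    return touched


# Usage: only the stand-in QuaRot layer is quantized; lm_head is left alone.
modules = {
    "layers.0.q_proj": StandInQuarotLinear([0.7, -0.32, 0.0]),
    "lm_head": Linear([0.5, 0.25]),
}
touched = quantize_model(modules, bits=4)
```

Selecting by type keeps the quantization loop independent of how a particular model names its layers, which is the kind of decoupling the PR title suggests.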