intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
https://intel.github.io/neural-compressor/
Apache License 2.0

Feature Request: Pruning Callback for pytorch lightning #1078

Closed — lminer closed this issue 11 months ago

lminer commented 1 year ago

PyTorch Lightning is very popular among torch users. It would be great to have a pruning callback for PyTorch Lightning.

PenghuiCheng commented 1 year ago

Hi @lminer, glad to see your suggestion. We will first evaluate whether the current API meets Lightning's requirements.

chensuyue commented 1 year ago

Hi @lminer, after some investigation, we have decided not to support PyTorch Lightning in the short term. Thanks again for your comments!

clementpoiret commented 11 months ago

@lminer I had to develop my own callbacks. They are not extensively tested, since they are very recent and I have only tried them on toy models, but you can find them here if you want to give them a try :) https://github.com/clementpoiret/lightning-nc

I set the requirements fairly high, but I believe they could be lowered.
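For illustration only (not the lightning-nc implementation), here is a minimal sketch of how a Lightning callback could drive Neural Compressor's pruning schedule. It assumes the INC 2.x training API (`prepare_compression` / `WeightPruningConfig`) and Lightning 2.x callback hook signatures; the class name `INCPruningCallback` and the config values are made up for the example.

```python
# Hypothetical sketch: wiring INC's pruning hooks into Lightning's callback hooks.
# Assumes neural-compressor 2.x and Lightning 2.x; not the lightning-nc code itself.
import lightning.pytorch as pl
from neural_compressor.training import WeightPruningConfig, prepare_compression


class INCPruningCallback(pl.Callback):
    """Drives Neural Compressor's pruning schedule from Lightning training hooks."""

    def __init__(self, pruning_config: WeightPruningConfig):
        self.pruning_config = pruning_config
        self.compression_manager = None

    def on_train_start(self, trainer, pl_module):
        # Wrap the LightningModule (an nn.Module) so INC can register pruning masks.
        self.compression_manager = prepare_compression(pl_module, self.pruning_config)
        self.compression_manager.callbacks.on_train_begin()

    def on_train_batch_start(self, trainer, pl_module, batch, batch_idx):
        self.compression_manager.callbacks.on_step_begin(batch_idx)

    def on_before_optimizer_step(self, trainer, pl_module, optimizer):
        # Mask gradients before the weight update.
        self.compression_manager.callbacks.on_before_optimizer_step()

    def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):
        # With automatic optimization this runs after optimizer.step();
        # gradient accumulation is not handled in this sketch.
        self.compression_manager.callbacks.on_after_optimizer_step()
        self.compression_manager.callbacks.on_step_end()

    def on_train_epoch_end(self, trainer, pl_module):
        self.compression_manager.callbacks.on_epoch_end()

    def on_train_end(self, trainer, pl_module):
        self.compression_manager.callbacks.on_train_end()


# Example usage: magnitude-style snip_momentum pruning to 90% sparsity.
config = WeightPruningConfig(target_sparsity=0.9, pattern="4x1",
                             pruning_type="snip_momentum")
trainer = pl.Trainer(max_epochs=3, callbacks=[INCPruningCallback(config)])
```

The idea is simply that Lightning's hook points (`on_train_batch_start`, `on_before_optimizer_step`, and so on) map one-to-one onto the step/epoch callbacks INC expects a training loop to call, so no changes to the LightningModule itself are needed.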

lminer commented 11 months ago

Wow, that looks great!