Efficient-ML / Awesome-Model-Quantization

A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving it. PRs adding works (papers, repositories) the repo has missed are welcome.

About the research direction #58

Closed TianGao-NJUST closed 1 week ago

TianGao-NJUST commented 1 week ago

Hello, Dr. Qin: Thank you very much for maintaining this repository; it provides a quick-start guide for the field. I have been quite confused recently, because since 2024 the binarization field has shifted almost entirely toward LLMs. Is there still any significance in researching ordinary backbone models?

best regards

htqin commented 1 week ago

Hi Tian,

Nice to hear from you! I still hold a positive attitude toward the topic you mentioned, i.e., binarization for ordinary backbones (CNNs/Transformers/...). Binarization is very friendly to low-power devices (see some analysis in the BiBench paper), especially for inference on mobile and edge hardware, so it remains significant for these scenarios. For LLMs, the binarization pipeline (especially since it builds on large-scale pre-training), scope, and inference differ greatly from those for ordinary backbones, and the related research is still at an early stage. So I think both directions have their own research value, and this repo will continue to collect papers in both areas.
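For readers new to the topic, the core idea behind the backbone-side binarization discussed above can be sketched in a few lines: full-precision weights are replaced by a sign matrix in {-1, +1} plus a per-tensor scaling factor (the mean absolute weight), in the spirit of BinaryConnect/XNOR-Net. This is a minimal illustrative sketch, assuming NumPy; the function name and the simple per-tensor scaling are choices made here for clarity, not a specific method from the repo.

```python
import numpy as np

def binarize_weights(w):
    """Binarize a weight tensor: returns (scale, sign matrix).

    scale is the mean absolute value of w; the sign matrix is in
    {-1, +1}, with zeros mapped to +1 for determinism.
    """
    alpha = np.abs(w).mean()
    wb = np.where(w >= 0, 1.0, -1.0)
    return alpha, wb

w = np.array([[0.4, -0.2],
              [-0.6, 0.8]])
alpha, wb = binarize_weights(w)
# The full-precision weights are approximated as alpha * wb,
# so the matrix multiply can run as cheap XNOR/popcount ops
# followed by a single floating-point scale.
w_hat = alpha * wb
```

On low-power hardware, the payoff is that the {-1, +1} multiply-accumulates can be implemented with bitwise XNOR and popcount instead of floating-point MACs, which is the source of the energy savings analyzed in work like BiBench.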

best, Haotong