urchade / GLiNER

Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024
https://arxiv.org/abs/2311.08526
Apache License 2.0
954 stars 82 forks

Compression techniques #37

Open erickdp opened 3 months ago

erickdp commented 3 months ago

I find it truly fascinating! Have you come across any methods similar to pruning, distillation, or quantization that could be applied to this model? While I'm aware of some size options, it would be truly remarkable if we could utilize compression techniques for more efficient processing and deployment on edge devices.

urchade commented 3 months ago

Hi @erickdp, thanks!

Indeed, it would be interesting; however, I am not really familiar with this field. Do you have any ideas on how we could do that?

erickdp commented 3 months ago

I could recommend the knowledge distillation method, which consists of fitting "student" models to "teacher" models. Since GLiNER uses a BERT-like architecture, it should apply well here. I have used it to distill sentiment classification models, and the results were really efficient in terms of compute, model size, and accuracy.
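As a rough illustration of what the distillation objective looks like, here is a minimal sketch of the classic soft-target loss (Hinton-style knowledge distillation), where the student is trained to match the teacher's temperature-softened output distribution. This is a standalone NumPy example with illustrative names, not part of the GLiNER codebase:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between the teacher's and student's soft targets.

    Scaled by T^2 so the gradient magnitude stays comparable to the
    hard-label loss when the two are combined during training.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    return (temperature ** 2) * kl.mean()
```

In practice this term is mixed with the regular task loss (e.g. `loss = alpha * distillation_loss(...) + (1 - alpha) * task_loss`), and the temperature and mixing weight are tuned on a validation set.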

In any case, excellent contribution!

References:
- https://neptune.ai/blog/knowledge-distillation
- https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation
- https://huggingface.co/lxyuan/distilbert-base-multilingual-cased-sentiments-student