OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License

Model File Formats: .pth, .bin vs. GGUF #20

Open sebvannistel opened 11 months ago

sebvannistel commented 11 months ago

Hello,

I've been exploring the OmniQuant repository and am impressed with the quantization techniques it provides for Large Language Models (LLMs). I noticed that the pre-trained models are available on Hugging Face in the .pth and .bin file formats.

I was wondering why these models are not also released in the GGUF format, which is generally considered more efficient for handling large models. Is there a specific reason for this choice of file formats? Am I missing something here?
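For what it's worth, the two families of formats are easy to tell apart on disk: PyTorch checkpoints saved with `torch.save` (PyTorch 1.6+) are zip archives and start with the `PK\x03\x04` magic, while GGUF files begin with the 4-byte magic `GGUF`. A minimal stdlib-only sketch (the helper name `sniff_checkpoint_format` is hypothetical, just for illustration):

```python
def sniff_checkpoint_format(path):
    """Guess a checkpoint's container format from its leading magic bytes."""
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic == b"GGUF":
        return "gguf"            # llama.cpp GGUF container
    if magic == b"PK\x03\x04":
        return "pytorch-zip"     # zip-based .pth/.bin from torch.save (>=1.6)
    return "unknown"             # e.g. legacy pickle-based checkpoints
```

This only identifies the container, of course; converting a .pth/.bin checkpoint to GGUF requires a dedicated conversion script (as in llama.cpp), not just a rename.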

I'm sure there is a reason for this; I'm probably just missing something.