AlpinDale / pygmalion.cpp

C/C++ implementation of PygmalionAI/pygmalion-6b
MIT License

How did you convert to GGML? #8

Open Ph0rk0z opened 1 year ago

Ph0rk0z commented 1 year ago

I'd like to convert other GPT-J to GGML but can't find any script for it.

AlpinDale commented 1 year ago

You can use this script.

Ph0rk0z commented 1 year ago

That won't work for a 4-bit quantized model. It just converts an HF model to 16- or 32-bit.

AlpinDale commented 1 year ago

> That won't work for a 4-bit quantized model. It just converts an HF model to 16- or 32-bit.

Yes, it won't quantize on its own. The original ggml repo has the quantization code, with instructions on how to use it. I have the code here as well, but it might be slightly outdated compared to upstream. You'll need to compile quantize.cpp instead of pyggy.cpp.
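For reference, the build-and-run flow would look roughly like this. This is a sketch assuming an upstream-ggml-style layout; the exact source files, compiler flags, and the numeric type argument may differ in this repo, so check its quantize.cpp before running.

```shell
# Hedged sketch, assuming quantize.cpp sits next to pyggy.cpp as described above.
# Compile the quantizer instead of the main inference binary:
g++ -O3 -std=c++11 quantize.cpp ggml.c -o quantize

# Quantize an f16/f32 GGML model file down to 4 bits.
# In upstream ggml at the time, the last argument selected the type
# (2 = q4_0, 3 = q4_1); verify the values this repo actually accepts.
./quantize ggml-model-f16.bin ggml-model-q4_0.bin 2
```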

Ph0rk0z commented 1 year ago

Oh, so no Python. I'll have to find a way to build these, then. I already have the 4-bit model. I was hoping there was a GPTQ-to-GGML converter that didn't need a tokenizer.model.

Ph0rk0z commented 1 year ago

I built it on Linux and it quantized gpt4chan. I'd upload the binary, but GitHub won't let me attach it. What is the difference between q4_0 and q4_1?