Thanks for your research. I have been trying your model and found it very impressive and useful.
I made a public model on Hugging Face quantized to 4-bit with bitsandbytes, since I saw some people asking for it.
One detail on hardware: the minimum GPU requirement is about 7 GB of VRAM for single-batch inference.
If anyone finds it useful, please feel free to use it.