Closed saftle closed 1 month ago
I think this is an LLM, not a VLM, so I'd need to make a node for it. I may or may not add this functionality, I'm not sure.
@gokayfem but you already have an LLMLoader that loads GGUFs (which works great, btw), it just doesn't load GPTQs :P
Just be aware that this will probably require a specially compiled version of llama-cpp-python in order to utilize the GPU. It's doable, but a massive headache, at least on Windows.
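For reference, the usual way to get a GPU-enabled build is to reinstall with CMake flags set, per the llama-cpp-python README (the exact flag depends on your version and backend; this assumes a CUDA setup):

```shell
# Force a from-source rebuild of llama-cpp-python with CUDA enabled.
# Newer versions use -DGGML_CUDA=on; older ones used -DLLAMA_CUBLAS=on.
# On Windows this additionally needs the Visual Studio build tools and
# the CUDA toolkit installed, which is where the headache comes from.
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
```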
I see that you can load individual GGUF files from \models\LLavacheckpoints, but how do you add GPTQ models like https://huggingface.co/TheBloke/U-Amethyst-20B-GPTQ? GPTQ is much, much faster than GGUF. I'm currently loading them with https://github.com/Zuellni/ComfyUI-ExLlama-Nodes but would love to switch to VLM Nodes, since there are a lot more features in this node pack.
VLM Nodes supports AutoGPTQ when loading https://huggingface.co/internlm/internlm-xcomposer2-vl-7b-4bit, so I assume it could load other GPTQ models as well. I just have no idea what the directory structure should look like, and/or whether I have to rename the safetensors to a specific filename.
Any help would be awesome!