Closed saftle closed 1 month ago
I think this is an LLM, not a VLM, so I'd need to make a node for it. I may or may not add this functionality, I'm not sure.
@gokayfem but you already have an LLMLoader that loads GGUFs (which works great, btw), it just doesn't load GPTQs :P
Just be aware that this will probably require a specially compiled version of llama-cpp-python in order to utilize the GPU. It's doable, but a massive headache, at least on Windows.
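For reference, the usual way to get a GPU-enabled build is to reinstall with CMake flags set, per the llama-cpp-python README (the exact flag depends on your version and backend; this assumes a CUDA setup):

```shell
# Force a from-source rebuild of llama-cpp-python with CUDA enabled.
# Newer versions use -DGGML_CUDA=on; older ones used -DLLAMA_CUBLAS=on.
# On Windows this additionally needs the Visual Studio build tools and
# the CUDA toolkit installed, which is where the headache comes from.
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
```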
I see that you can load individual GGUF files from \models\LLavacheckpoints, but how do you add GPTQ models like https://huggingface.co/TheBloke/U-Amethyst-20B-GPTQ? GPTQ is much, much faster than GGUF. I'm currently loading them with https://github.com/Zuellni/ComfyUI-ExLlama-Nodes but would love to switch to VLM Nodes, since there are a lot more features in this node pack.
VLM Nodes supports AutoGPTQ when loading https://huggingface.co/internlm/internlm-xcomposer2-vl-7b-4bit, so I assume it could load other GPTQ models as well. I just have no idea what the directory structure should look like, and/or whether I have to rename the safetensors to a specific filename.
Any help would be awesome!