heshengtao / comfyui_LLM_party

LLM Agent Framework in ComfyUI includes Omost, GPT-SoVITS, ChatTTS, GOT-OCR2.0, and FLUX prompt nodes; provides access to Feishu and Discord; and adapts to all LLMs with OpenAI/Gemini-like interfaces, such as o1, Ollama, Grok, Qwen, GLM, DeepSeek, Moonshot, and Doubao. Also adapted to local LLMs, VLMs, and GGUF models such as Llama-3.2; Neo4j KG linkage; GraphRAG / RAG / HTML-to-image.
GNU Affero General Public License v3.0
1.06k stars · 94 forks

Please, add local LVM model support and example [feature request] #12

Closed · bigcat88 closed 6 months ago

bigcat88 commented 6 months ago

Any model will do, even a simple one for start.

Most people in ComfyUI will be interested in a model that can determine the GENDER of a subject: boy or girl / man or woman.

This is very useful for InstantID and IP-Adapter workflows, where you want to regenerate a picture.

VLM nodes will also be very useful for upscaler workflows and for style-transfer ones.

As I am very interested in LVM nodes, I can try to create one and open a PR for this. I will have some time in the evenings during the week, so I can probably open a PR next weekend.

heshengtao commented 6 months ago

That’s really great. I also think that supporting LVM is an indispensable feature.

heshengtao commented 6 months ago

I’ve been a bit busy this past week, mainly spending time creating a tutorial video for this project and fixing some minor issues within it. It was only over the weekend that I had a substantial amount of time.

May I ask if you have already started writing the PR related to the LVM model? If so, I won’t duplicate the development. Instead, I plan to create something that can package a ComfyUI workflow into an OpenAI-compatible interface, making it convenient for any software that can integrate with OpenAI to access user-defined workflows. If you haven’t started yet, I can begin writing one, and you can give your feedback on it later.
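For illustration, here is a minimal sketch of the "package a workflow into an OpenAI-compatible interface" idea, assuming a FastAPI server; `run_comfyui_workflow()` is a hypothetical stand-in for whatever actually executes the saved workflow, not this project's real code:

```python
# Hedged sketch: expose a ComfyUI workflow behind an OpenAI-compatible
# /v1/chat/completions endpoint. run_comfyui_workflow() is hypothetical.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    model: str
    messages: list  # OpenAI-style [{"role": ..., "content": ...}, ...]

def run_comfyui_workflow(prompt: str) -> str:
    # Hypothetical stand-in: submit `prompt` to a saved ComfyUI workflow
    # and return its text output.
    return f"(workflow output for: {prompt})"

@app.post("/v1/chat/completions")
def chat_completions(req: ChatRequest):
    user_prompt = req.messages[-1]["content"]
    answer = run_comfyui_workflow(user_prompt)
    # Minimal OpenAI-style response shape, so any OpenAI client can parse it.
    return {
        "object": "chat.completion",
        "model": req.model,
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": answer},
            "finish_reason": "stop",
        }],
    }
```

Any client that speaks the OpenAI API could then point its base URL at this server and trigger a user-defined workflow.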

bigcat88 commented 6 months ago

Unfortunately, I had exactly the same situation and didn’t have time all week, and I expect next week will be full of work too. :( On this topic, I was only able to check the nodes from here: https://github.com/gokayfem/ComfyUI_VLM_nodes and the nodes for the local version of Ollama: https://github.com/stavsap/comfyui-ollama

I can say that, specifically for the situations where we use ComfyUI, the approach with a separate local server is not suitable, since control over unloading the model is lost, and ComfyUI’s workflows end up waiting for the computer with the Ollama server to process the request. This adds the requirement to either have a second computer or to run Ollama on the CPU if a person has only one computer.

The approach you described, with a universal interface where you can outsource some of the tasks to another service, is quite an interesting and good solution, imho :)

heshengtao commented 6 months ago

When I have time, I will start working on LVM-related tasks, beginning with the LLaVA model :)

heshengtao commented 6 months ago

Sorry for the wait. I have adapted this model: llava-llama-3-8b-v1_1-gguf. The example workflow can be found here: start_with_LVM.json. Due to the use of llama_cpp_python code, it may not be perfectly compatible with MPS devices. You can see the adaptation code I made for different devices here. I’m not sure if it will cause errors on macOS and MPS, and I would greatly appreciate your help. 🙂
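As a rough illustration of what driving such a GGUF vision model through llama_cpp_python looks like (a generic sketch, not this repo's node code; the file names, paths, and the choice of Llava15ChatHandler are assumptions):

```python
# Generic sketch of running a LLaVA-style GGUF model with llama-cpp-python.
# Model/projector file names and paths are assumptions, not this repo's code.
import base64

from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

def image_to_data_uri(path: str) -> str:
    # llama-cpp-python accepts images as base64 data URIs in image_url content.
    with open(path, "rb") as f:
        return "data:image/png;base64," + base64.b64encode(f.read()).decode()

chat_handler = Llava15ChatHandler(clip_model_path="mmproj-model-f16.gguf")
llm = Llama(
    model_path="llava-llama-3-8b-v1_1-int4.gguf",
    chat_handler=chat_handler,
    n_ctx=4096,       # room for the image embedding plus the prompt
    n_gpu_layers=-1,  # offload everything to GPU; use 0 for CPU-only
)

response = llm.create_chat_completion(messages=[
    {"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": image_to_data_uri("person.png")}},
        {"type": "text", "text": "Is the person in this picture a man or a woman?"},
    ]},
])
print(response["choices"][0]["message"]["content"])
```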

bigcat88 commented 6 months ago
(screenshot attached)

llama_cpp_python is working fine; there is just a problem with installing it from the default PyPI (but it is available in the GH releases).

Surprisingly, it worked with int4 - I didn’t even have to do anything :)

Good job, really.

Can this issue be closed now?

bigcat88 commented 6 months ago

Just a note: it would be good not to display the encoded image data in the "history".

A few big images of 10-25 MB will make the history completely undisplayable in the browser.
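One common way to handle this (a hypothetical sketch, not the patch that was actually applied) is to replace inline base64 payloads with a short placeholder before rendering the history:

```python
# Hypothetical sketch: strip base64 image payloads from chat history before
# displaying it, so large images don't bloat the rendered history.
import re

DATA_URI = re.compile(r"data:image/[^;]+;base64,[A-Za-z0-9+/=]+")

def sanitize_history(messages: list[dict]) -> list[dict]:
    """Return a copy of the history with inline images replaced by a marker."""
    cleaned = []
    for msg in messages:
        content = msg.get("content", "")
        if isinstance(content, str):
            content = DATA_URI.sub("[image omitted]", content)
        cleaned.append({**msg, "content": content})
    return cleaned
```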

heshengtao commented 6 months ago

The issue with the history has been resolved, and the download source for llama_cpp_python on macOS has been adjusted. The problem is now solved. I sincerely appreciate your assistance. Should you have any further recommendations, or if there’s anything else you require, I would be grateful if you could kindly reach out to me. :)