heshengtao / comfyui_LLM_party

LLM Agent Framework in ComfyUI. Includes Omost, GPT-SoVITS, ChatTTS, GOT-OCR2.0, and FLUX prompt nodes; provides access to Feishu and Discord; and adapts to all LLMs with OpenAI/Gemini-like interfaces, such as o1, ollama, Grok, Qwen, GLM, DeepSeek, Moonshot, and Doubao. Adapted to local LLMs, VLMs, and GGUF models such as Llama-3.2. Supports Neo4j knowledge-graph linkage, GraphRAG / RAG, and HTML-to-image.
GNU Affero General Public License v3.0

Unable to import picture #88

Closed newbie-comfyui closed 2 months ago

newbie-comfyui commented 2 months ago


```
WARNING: LLM_local_loader.IS_CHANGED() got an unexpected keyword argument 'model_name'
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used in <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 2/2 [00:03<00:00, 1.83s/it]
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
```

The model then responds: "Please provide the image you would like me to describe. If you don't have an image in mind, I can suggest a scene for you to describe:"

A serene forest path during the golden hour, with sunlight filtering through the dense canopy of trees, casting dappled shadows on the forest floor. A gentle stream runs alongside the path, its crystal-clear waters reflecting the warm hues of the setting sun. The air is filled with the sweet scent of blooming wildflowers, and the distant chirping of birds adds to the tranquil atmosphere.

newbie-comfyui commented 2 months ago

HELP ME, PLEASE!

heshengtao commented 2 months ago

Omost doesn't have vision.

heshengtao commented 2 months ago

If you want to use a vision model, you need to use the API version of the node and relay through llama.cpp's server or ollama, or directly use a vision-capable API model such as GPT-4o.
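For context, the relay approach works because both llama.cpp's server and ollama expose an OpenAI-compatible `/v1/chat/completions` endpoint, so an API-style node only has to send a chat request with the image embedded as a base64 data URL. A minimal sketch of that request payload, assuming a hypothetical local endpoint and the model name `llava` (an example vision model, not something the party nodes require):

```python
import base64


def build_vision_payload(image_bytes: bytes, prompt: str,
                         model: str = "llava") -> dict:
    """Build an OpenAI-style chat payload with a base64-embedded image.

    This is the message shape accepted by OpenAI-compatible vision
    endpoints such as ollama's or llama.cpp server's
    /v1/chat/completions route.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }


# Example: POST this as JSON to e.g. http://localhost:11434/v1/chat/completions
payload = build_vision_payload(b"\x89PNG\r\n...", "Describe this image.")
```

The endpoint URL and port above are illustrative; point the API node at wherever your llama.cpp or ollama server is actually listening.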