Image To Prompt issue - Githubissues

if-ai / ComfyUI-IF_AI_tools

ComfyUI-IF_AI_tools is a set of custom nodes for ComfyUI that allows you to generate prompts using a local Large Language Model (LLM) via Ollama. This tool enables you to enhance your image generation workflow by leveraging the power of language models.

https://ko-fi.com/impactframes

366 stars, 27 forks

Image To Prompt issue #9

Closed MMoneer closed 2 months ago

MMoneer commented 2 months ago

Hi, Image to Prompt doesn't work correctly for generating images from the output prompt: it loops without outputting anything. I'm using your workflow with Ollama version 1.30, Win 10, ComfyUI: 209296b4c7, Manager: V2.11. In the note you mention that "Ollama has a bug on the latest version". I don't know whether you mean a bug in outputting the prompt (which is working fine) or in generating images from the prompt.

Also, what are the API key errors in the console? Thank you.

if-ai commented 2 months ago

Sorry, it might be that the display text node has a bug. If Ollama is working now, you can connect the response directly to the conditioning node. You can still display the text, just don't use the output of the display node for now. I will fix that in the next update.

MMoneer commented 2 months ago

Thank You so much, it's working now

alexdoroga commented 1 month ago

I'm looking for a way to describe the image. All three models describe either a cyberdemon or a woman, regardless of the input image. llava 7b sometimes guesses what is in the picture, but almost always describes the style and scene with “Epic, ... Cyberpunk, Dynamic Angle, ... Neon City, Futuristic Clothing, Street Art, Hovering Cars”, none of which is there at all. How can I get the pictures described correctly? Sorry for the lame questions, I'm not a very good Pythonist.

Screenshot 2024-05-26 114221

Screenshot 2024-05-26 114802

Screenshot 2024-05-26 114607

Ideally, I would like the LLM to describe separately what is in the picture, like “Teenage Boy, Riding Bicycle, on the street, blue t-shirt”, etc., and separately output the style, like “by Hayao Miyazaki, digital art, in ghibli style, manga”, etc.

if-ai commented 1 month ago

Yes, this one I haven't seen. It looks like you are doing everything correctly. Maybe try the newer IF Chat node in SD mode. It looks like an issue connecting with Ollama, so maybe update to the latest version, make sure Ollama is running, and try the IP address 127.0.0.1 instead of localhost.
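To rule out a connection problem outside ComfyUI, a minimal sketch like the one below can check whether Ollama answers on 127.0.0.1. It assumes Ollama's default port 11434 and its /api/tags endpoint, which lists the installed models. The helper names ollama_url and list_models are illustrative and are not part of ComfyUI-IF_AI_tools.

```python
import json
import urllib.error
import urllib.request


def ollama_url(host: str = "127.0.0.1", port: int = 11434, path: str = "/api/tags") -> str:
    """Build an Ollama API URL. Port 11434 is Ollama's default (assumed unchanged)."""
    return f"http://{host}:{port}{path}"


def list_models(host: str = "127.0.0.1", port: int = 11434, timeout: float = 3.0):
    """Return the installed model names, or None if Ollama cannot be reached."""
    try:
        with urllib.request.urlopen(ollama_url(host, port), timeout=timeout) as resp:
            data = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a reply that is not valid JSON.
        return None
    return [model.get("name", "") for model in data.get("models", [])]
```

Calling list_models("127.0.0.1") and then list_models("localhost") shows whether only one of the two addresses works. A result of None means the server did not answer at that address; a list without a vision model such as llava would explain descriptions that ignore the input image.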

alexdoroga commented 1 month ago

I'm using the latest version of Ollama: I just installed it, and it was updated recently. The “Ollama Generate” nodes work fine, but they need a lot of manual writing and are very inconvenient. Below is the result of what I got following your recommendation. ((

Screenshot 2024-05-26 123140

workflow (10).json

if-ai commented 1 month ago

Yes, in the screenshot you are not using llava anymore. You need to use llava or another vision model to be able to see the image. Also empty the stop field. If you don't want keywords, don't use IFPromptIMG, as it tries to add SD style keywords. Please let me know if it works.
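For reference, here is a rough sketch of what a request to a vision model looks like when sent to Ollama directly, without any keyword-adding preset in between. It assumes Ollama's /api/generate endpoint on the default port 11434 and a pulled llava model. The two prompt strings and the function names are made up for illustration and are not what the IF nodes send.

```python
import base64
import json
import urllib.request

# Hypothetical prompts: one for the content of the image, one for its style.
SUBJECT_PROMPT = "List only what is visible in the image as short comma-separated tags."
STYLE_PROMPT = "List only the art style and medium of the image as short comma-separated tags."


def build_request(image_bytes: bytes, prompt: str, model: str = "llava") -> dict:
    """Build the JSON body for /api/generate; images are sent base64-encoded."""
    return {
        "model": model,  # must be a vision model, a text-only model cannot see the image
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one complete JSON reply instead of a stream
    }


def describe(image_path: str, prompt: str, base_url: str = "http://127.0.0.1:11434") -> str:
    """Send one image and one prompt to Ollama and return the generated text."""
    with open(image_path, "rb") as f:
        payload = build_request(f.read(), prompt)
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)["response"]
```

Calling describe once with SUBJECT_PROMPT and once with STYLE_PROMPT on the same file would give the separate content and style descriptions asked for earlier in the thread. How well llava follows either prompt is not guaranteed.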

alexdoroga commented 1 month ago

“don't use the IFPromptIMG as it tries to add SD style keywords”

I'm sorry, I didn't understand how to do it here. The screenshot below shows the list of possible assistants. I found the SD option at the end and turned it on.

The second screenshot shows that the description did appear. I understand that I still need to use Prompt to Prompt with the model “stable-diffusion-prompt-generator:latest” to turn it into a usable prompt. I thought it was possible to get a nice prompt in one step, with one model)))

Screenshot 2024-05-26 130958

And as for the style, I used the SDXL IP-adapter for the 10th time today and it worked out fine )))))) It turns out that you don't need to torture the LLM ))) But it would be nice to get a description of the style, at least to find out how the model describes it, understands it, etc.

Thank you for your work - you have made a very cool repository ))

Greetings from Ukraine ))

if-ai commented 1 month ago

Thank you, I am glad I could help. Also, you can edit the profiles. They are JSON files stored in the custom_nodes/ComfyUI-IF_AI_tools folder. You can make your own using IFpromptMKR_IMG as a base; with a little editing it can yield results closer to what you expect. The same goes for the other presets: you can edit them or make your own.
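Copying a profile can also be done with a few lines of Python instead of editing by hand. The sketch below is only an illustration under a loud assumption: it treats a profile file as a JSON object keyed by profile name, which may not match the actual files in custom_nodes/ComfyUI-IF_AI_tools, so open the file and check its layout first. The function names and the file path passed in are hypothetical.

```python
import copy
import json
from pathlib import Path


def add_profile(profiles: dict, base_name: str, new_name: str, edit) -> dict:
    """Return a copy of `profiles` with `new_name` added, derived from `base_name`.

    ASSUMPTION: profiles are stored as {profile_name: profile_value}.
    `edit` is a function that receives a copy of the base value and returns the new one.
    """
    if base_name not in profiles:
        raise KeyError(f"base profile {base_name!r} not found")
    result = copy.deepcopy(profiles)
    result[new_name] = edit(copy.deepcopy(profiles[base_name]))
    return result


def clone_profile_file(json_path: str, base_name: str, new_name: str, edit) -> None:
    """Load a profile file, add the derived profile, and write the file back."""
    path = Path(json_path)
    profiles = json.loads(path.read_text(encoding="utf-8"))
    updated = add_profile(profiles, base_name, new_name, edit)
    path.write_text(json.dumps(updated, indent=2, ensure_ascii=False), encoding="utf-8")
```

The base entry, for example IFpromptMKR_IMG, is left untouched, so the original preset keeps working while the copy is tuned. Making a backup of the JSON file before writing to it is a sensible precaution.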