langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Stable Diffusion can't generate a correct picture in an agent with a llama model #8316

Closed marco-qu closed 2 weeks ago

marco-qu commented 1 month ago

Dify version

0.8

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Stable Diffusion (SD) and Dify are both running in Docker on my local computer. I created a new agent that uses the Stable Diffusion tool to generate a picture from the user's input text. Although a picture was created successfully, an error occurred at the same time. Some information is below:

[image: a93b8a10c4d0f594e92c0ba4096c6b6] [image: 2c58499d5fcbd5444d8287bfabb29d8]

After one picture was created, the program crashed. When I exited and refreshed the page, I could see the picture shown above.

✔️ Expected Behavior

Get the correct picture, and the program does not crash.

❌ Actual Behavior

The program crashed and I needed to reopen the page.

dosubot[bot] commented 1 month ago

Hey @marco-qu! 👋 I'm here to help you with bug fixes, answer questions, and even guide you on becoming a contributor. Let's squash those bugs together!

I found two similar unsolved discussions that might be relevant to your issue:

  1. stable diffusion text2image [1].
  2. Drawing with the stable diffusion webui tool fails with "Failed to generate image" [2].

The error message "tool_parameters should be a dict, but got a string" occurs because the tool_parameters argument passed to the _invoke method of the StableDiffusionTool class is expected to be a dictionary, but a string is being provided instead.

In the provided code, the _invoke method is defined as follows:

def _invoke(
    self, user_id: str, tool_parameters: dict[str, Any]
) -> Union[ToolInvokeMessage, list[ToolInvokeMessage]]:
    headers = {
        "accept": "application/json",
        "content-type": "application/json",
        "authorization": f"Bearer {self.runtime.credentials['siliconFlow_api_key']}",
    }

    model = tool_parameters.get("model", "sd_3")
    url = SDURL.get(model)

    payload = {
        "prompt": tool_parameters.get("prompt"),
        "negative_prompt": tool_parameters.get("negative_prompt", ""),
        "image_size": tool_parameters.get("image_size", "1024x1024"),
        "batch_size": tool_parameters.get("batch_size", 1),
        "seed": tool_parameters.get("seed"),
        "guidance_scale": tool_parameters.get("guidance_scale", 7.5),
        "num_inference_steps": tool_parameters.get("num_inference_steps", 20),
    }

    response = requests.post(url, json=payload, headers=headers)
    if response.status_code != 200:
        return self.create_text_message(f"Got Error Response:{response.text}")

    res = response.json()
    result = [self.create_json_message(res)]
    for image in res.get("images", []):
        result.append(self.create_image_message(image=image.get("url"), save_as=self.VariableKey.IMAGE.value))
    return result

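The tool_parameters argument is annotated as a dictionary (dict[str, Any]), and the method reads every value through tool_parameters.get(...). Python does not enforce the annotation at runtime, so the reported message most likely comes from an explicit type check made before the tool runs; either way, a string in place of the dictionary causes the invocation to fail.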
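A minimal, self-contained illustration of a string reaching dict-style code. The function build_payload is hypothetical and only mimics the .get calls in _invoke; it is not Dify code. With a dict the call succeeds; with a plain string the first .get call raises AttributeError:

```python
from typing import Any


def build_payload(tool_parameters: dict[str, Any]) -> dict[str, Any]:
    # Hypothetical helper: mimics the dict-style access used in _invoke.
    return {
        "prompt": tool_parameters.get("prompt"),
        "batch_size": tool_parameters.get("batch_size", 1),
    }


def try_build(tool_parameters: Any) -> str:
    # Report the outcome for a given argument instead of crashing.
    try:
        build_payload(tool_parameters)
        return "ok"
    except AttributeError as exc:
        return f"failed: {exc}"


print(try_build({"prompt": "a cute dog"}))  # dict argument
print(try_build("a cute dog"))              # string argument
```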

To resolve this issue, ensure that when invoking the StableDiffusionTool, the tool_parameters argument is a dictionary. For example:

tool_parameters = {
    "prompt": "a playful beagle with big brown eyes and floppy ears. It's sitting next to a small pond, surrounded by lily pads and water plants.",
    "image_size": "1024x1024",
    "batch_size": 1,
    "model": "sd_3",  # the Stable Diffusion model key looked up in SDURL, not the agent's LLM
    "negative_prompt": "",
    "seed": None,
    "guidance_scale": 7.5,
    "num_inference_steps": 20
}

Then, call the _invoke method with the correct type:

stable_diffusion_tool._invoke(user_id="some_user_id", tool_parameters=tool_parameters)

Make sure that the tool_parameters is always a dictionary when calling the _invoke method to avoid this error [3][4][5].
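One way to enforce this at the call site is to normalize the argument before invoking the tool. The sketch below is an assumption, not Dify's implementation: coerce_tool_parameters is a hypothetical helper that accepts either a dict or a JSON-encoded string (a common shape for LLM-produced tool arguments) and rejects anything else.

```python
import json
from typing import Any


def coerce_tool_parameters(raw: Any) -> dict[str, Any]:
    """Return tool parameters as a dict, or raise ValueError.

    Hypothetical helper for illustration; not part of Dify.
    """
    if isinstance(raw, dict):
        return raw
    if isinstance(raw, str):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError as exc:
            raise ValueError(f"tool_parameters is not valid JSON: {raw!r}") from exc
        if isinstance(parsed, dict):
            return parsed
    # Reached for non-dict, non-string input and for JSON that is not an object.
    raise ValueError(
        f"tool_parameters should be a dict, but got {type(raw).__name__}"
    )


params = coerce_tool_parameters('{"prompt": "a cute dog", "batch_size": 1}')
print(params["prompt"])
```

A plain prompt such as "a cute dog" is not valid JSON, so the helper raises ValueError rather than guessing which parameter the text belongs to.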


marco-qu commented 1 month ago

Actually, I entered "a cute dog" only once, as in the image above, but the agent called SD more than once. After the picture was created successfully, the process did not exit; it called SD a second time with incorrect parameters, and then the process crashed. [image]

danni-cool commented 3 days ago

Same issue here, and even worse: the Stable Diffusion tool started a generation task but produced no output image.