Closed: czslxk closed this issue 5 months ago.
Hello @czslxk! I'm here to help you with bugs, questions, and becoming a contributor. Let's tackle this issue together!
To resolve the issue where the Stable Diffusion WebUI node in your workflow outputs NULL instead of a picture or a base64 string of the picture, ensure that you have correctly configured the required credentials for the Stable Diffusion provider. Specifically, you need to provide the `base_url` and `model` parameters correctly.
Here is the relevant configuration snippet:
```yaml
credentials_for_provider:
  base_url:
    type: secret-input
    required: true
    label:
      en_US: Base URL
    placeholder:
      en_US: Please input your StableDiffusion server's Base URL
  model:
    type: text-input
    required: true
    label:
      en_US: Model
    placeholder:
      en_US: Please input your model
    help:
      en_US: The model name of the StableDiffusion server
```
Make sure you have:

- Entered the correct Base URL of your Stable Diffusion server.
- Specified the correct model name used by your Stable Diffusion server.

Additionally, follow these steps to ensure your Stable Diffusion WebUI is properly set up:

1. Make sure you have a machine with a GPU: while not strictly necessary, using a GPU will significantly speed up image generation.
2. Launch Stable Diffusion WebUI: clone the Stable Diffusion WebUI repository from the official repository, then launch it locally using the appropriate command for your operating system:
   - Windows:
     ```shell
     cd stable-diffusion-webui
     ./webui.bat --api --listen
     ```
   - Linux:
     ```shell
     cd stable-diffusion-webui
     ./webui.sh --api --listen
     ```
3. Prepare models: download the models from HuggingFace or other sources and place them in the `models` directory of the Stable Diffusion WebUI.
4. Get the model name: access the model list at `http://your_id:port/sdapi/v1/sd-models` to retrieve the model name.

Ensure that the `base_url` and `model` parameters in your configuration match the setup of your Stable Diffusion WebUI. If these credentials are not set correctly, the node may not be able to communicate with the server properly, resulting in a NULL output.
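As a quick sanity check, you can query the model-list endpoint yourself. This is an illustrative sketch (the base URL is a placeholder for your own server); `extract_model_names` pulls out the `model_name` field that belongs in the tool's Model setting:

```python
import json
import urllib.request

def extract_model_names(payload: list) -> list:
    # Each entry returned by /sdapi/v1/sd-models is a dict;
    # "model_name" is the value to paste into the tool's Model field.
    return [m["model_name"] for m in payload]

def list_sd_models(base_url: str) -> list:
    # base_url is a placeholder, e.g. "http://127.0.0.1:7860" for a default install.
    with urllib.request.urlopen(f"{base_url}/sdapi/v1/sd-models", timeout=10) as resp:
        return extract_model_names(json.load(resp))
```

If `list_sd_models` raises or returns an empty list, the NULL output is most likely a credentials or connectivity problem rather than a workflow problem.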
Additionally, ensure that the `send_images` parameter is set to `True` in your tool parameters. This parameter is crucial for the output to include the generated images:
```python
DRAW_TEXT_OPTIONS = {
    ...
    "send_images": True,
    ...
}
```
If the `send_images` parameter is not set correctly, the output might not include the generated images, leading to a NULL result.
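To see the effect of `send_images` end to end, here is a hedged sketch of a direct call to the WebUI's `txt2img` endpoint. The endpoint path and the `images` response field come from the AUTOMATIC1111 WebUI API; the default values mirror the tool's parameter defaults and are otherwise illustrative:

```python
import base64
import json
import urllib.request

def build_txt2img_payload(prompt: str, **overrides) -> dict:
    # Defaults mirror the tool's parameter configuration; adjust as needed.
    payload = {
        "prompt": prompt,
        "negative_prompt": "bad art, ugly, deformed, watermark",
        "steps": 10,
        "width": 1024,
        "height": 1024,
        "send_images": True,  # without this the "images" list comes back empty
    }
    payload.update(overrides)
    return payload

def txt2img(base_url: str, prompt: str) -> bytes:
    req = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(build_txt2img_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        body = json.load(resp)
    # "images" holds base64-encoded PNGs when send_images is True.
    return base64.b64decode(body["images"][0])
```

If the raw API call returns images but the workflow node still outputs NULL, the problem is on the Dify side rather than in the WebUI setup.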
Moreover, ensure that other parameters such as `prompt`, `steps`, `width`, `height`, and `negative_prompt` are correctly set as per your requirements. Here is a snippet of the parameters configuration:
```yaml
parameters:
  - name: prompt
    type: string
    required: true
    label:
      en_US: Prompt
    human_description:
      en_US: Image prompt, you can check the official documentation of Stable Diffusion
    llm_description: Image prompt of Stable Diffusion, you should describe the image you want to generate as a list of words as possible as detailed, the prompt must be written in English.
    form: llm
  - name: model
    type: string
    required: false
    label:
      en_US: Model Name
    human_description:
      en_US: Model Name
    form: form
  - name: lora
    type: string
    required: false
    label:
      en_US: Lora
    human_description:
      en_US: Lora
    form: form
    default: ""
  - name: steps
    type: number
    required: false
    label:
      en_US: Steps
    human_description:
      en_US: Steps
    form: form
    default: 10
  - name: width
    type: number
    required: false
    label:
      en_US: Width
    human_description:
      en_US: Width
    form: form
    default: 1024
  - name: height
    type: number
    required: false
    label:
      en_US: Height
    human_description:
      en_US: Height
    form: form
    default: 1024
  - name: negative_prompt
    type: string
    required: false
    label:
      en_US: Negative prompt
    human_description:
      en_US: Negative prompt
    form: form
    default: bad art, ugly, deformed, watermark, duplicated, discontinuous lines
```
By ensuring all these configurations and parameters are correctly set, you should be able to resolve the issue of the Stable Diffusion WebUI node outputting NULL [1][2][3][4].
To continue talking to Dosu, mention @dosu.
I think I set everything you mentioned correctly, because I can use SD as a tool in the Agent function, but only once (it may be because of the Agent settings). So I think the base URL, model name, and `send_images` parameter are all right. Here is a screenshot of when I use the Agent.
If you have confirmed that the `base_url`, `model`, and `send_images` parameters are correctly set, and you can use Stable Diffusion as a tool in the Agent function, the issue might be related to the specific configuration or parameters in your workflow.
Here are a few additional checks and steps you can take:
- Prompt Parameter: Ensure that the `prompt` parameter is correctly set and not empty. The `prompt` is mandatory and should be a detailed description of the image you want to generate.
- Check for Errors: Look for any error messages or logs that might provide more insight into why the output is NULL. This can help identify whether there are issues with the request or response.
- Validate Credentials: Even though the credentials work in the Agent function, double-check that they are correctly configured in the workflow context. The `StableDiffusionProvider` class includes a `_validate_credentials` method that validates the provided credentials. Ensure that your credentials pass this validation step.
- Workflow Configuration: Verify that all other parameters such as `steps`, `width`, `height`, and `negative_prompt` are correctly set as per your requirements. Incorrect or missing parameters might cause the node to fail silently.
- Network Issues: Ensure there are no network issues or firewalls blocking the communication between your workflow and the Stable Diffusion server.
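The network check can be automated with a small TCP probe. This is an illustrative helper, not part of Dify:

```python
import socket
from urllib.parse import urlparse

def reachable(base_url: str, timeout: float = 3.0) -> bool:
    # True if a plain TCP connection to the server's host/port succeeds.
    parsed = urlparse(base_url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A `False` result means the workflow container cannot even open a connection, for example because the WebUI was started without `--listen` or a firewall is in the way.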
If all these checks are in place and the issue persists, it might be helpful to review the specific configuration and parameters used in your workflow compared to the Agent function where it works. This can help identify any discrepancies or additional settings that might be required in the workflow context.
To continue talking to Dosu, mention @dosu.
After finishing the checks and steps you mentioned, I found that the key point is the "Agent Mode" in the "Agent Settings". When I use llama3, the "Agent Mode" is "ReAct", but when I use other models, such as Moonshot AI and GPT-3.5, it is "Function Calling".
"ReAct" may lead to a dead loop, meaning the agent keeps talking to itself and won't stop. "Function Calling" is more like what I want: it just calls the function when needed.
So I have some questions:
What's the difference between these two modes? And what kinds of tasks are they designed for?
Could we change the mode for each model? For example, if I want to change the "Agent Mode" from "ReAct" to "Function Calling", how could I do that?
The difference between the "ReAct" and "Function Calling" modes lies in their approach to selecting and querying knowledge bases:

- ReAct Mode: This mode involves reasoning and acting based on the user query. The system model uses its reasoning capabilities to determine the most relevant knowledge base to query. It is designed for tasks that require more complex reasoning and decision-making processes.
- Function Calling Mode: In this mode, each knowledge base acts as a tool function. The system model autonomously selects the most relevant knowledge base to query based on the semantic match between the user question and the knowledge base description. This mode is more straightforward and is designed for tasks that involve direct function calls without extensive reasoning.
Yes, it is possible to change the mode for each model. You can set the type of inference mode for an agent, choosing between ReAct and Function Calling modes. This flexibility allows you to tailor the agent's behavior to better suit your application's needs. The mode can be set in the model configuration, and the system will adapt its behavior accordingly [1][2][3].
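As a rough sketch (not Dify's implementation) of why the two modes behave so differently, compare a ReAct loop, which must be capped or a model that never produces a final answer will loop forever, with a single function call:

```python
def function_calling_step(model_output: dict, tools: dict) -> str:
    """Function Calling: the model emits one structured call; we run it once."""
    call = model_output["tool_call"]  # e.g. {"name": "sd_draw", "args": {...}}
    return tools[call["name"]](**call["args"])

def react_loop(model, tools: dict, question: str, max_turns: int = 5) -> str:
    """ReAct: alternate Thought -> Action -> Observation until a final answer.
    The turn cap is the safety net against the 'dead loop' described above."""
    scratchpad = question
    for _ in range(max_turns):
        step = model(scratchpad)  # hypothetical model callable returning a dict
        if step.get("final"):
            return step["answer"]
        observation = tools[step["action"]](**step["args"])
        scratchpad += f"\nObservation: {observation}"
    return "stopped: turn limit reached"
```

A model that reliably emits structured calls (like GPT-3.5) suits Function Calling, while ReAct trades extra model turns for models that cannot.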
To continue talking to Dosu, mention @dosu.
But I cannot change the mode on the web page.
To change the "Agent Mode" from "ReAct" to "Function Calling" in the web interface, you need to toggle the `isFunctionCall` property. This can be done by updating the `tempPayload` state within the `AgentSetting` component.
Here is the relevant part of the code where the agent mode is displayed:
```tsx
<ItemPanel
  className='mb-4'
  icon={
    <CuteRobote className='w-4 h-4 text-indigo-600' />
  }
  name={t('appDebug.agent.agentMode')}
  description={t('appDebug.agent.agentModeDes')}
>
  <div className='leading-[18px] text-[13px] font-medium text-gray-900'>
    {isFunctionCall ? t('appDebug.agent.agentModeType.functionCall') : t('appDebug.agent.agentModeType.ReACT')}
  </div>
</ItemPanel>
```
To change the mode, you would need to update the `isFunctionCall` state. This can be done by adding a toggle mechanism, such as a button or a switch, that updates the `tempPayload` state accordingly.
Here is an example of how you might add a button to toggle the agent mode:
```tsx
<ItemPanel
  className='mb-4'
  icon={
    <CuteRobote className='w-4 h-4 text-indigo-600' />
  }
  name={t('appDebug.agent.agentMode')}
  description={t('appDebug.agent.agentModeDes')}
>
  <div className='leading-[18px] text-[13px] font-medium text-gray-900'>
    {isFunctionCall ? t('appDebug.agent.agentModeType.functionCall') : t('appDebug.agent.agentModeType.ReACT')}
  </div>
  <Button
    onClick={() => setTempPayload({ ...tempPayload, isFunctionCall: !isFunctionCall })}
  >
    {isFunctionCall ? t('Switch to ReAct') : t('Switch to Function Calling')}
  </Button>
</ItemPanel>
```
This button will toggle the `isFunctionCall` state between `true` and `false`, effectively changing the agent mode.
The descriptions for the agent modes are as follows:

- ReAct: "ReAct"
- Function Calling: "Function Calling"
These descriptions can be found in the `web/i18n/zh-Hans/app-debug.ts` file under the `agentModeType` object [1][2][3][4].
To continue talking to Dosu, mention @dosu.
Why does llama3 default to ReAct instead of Function Call in Agent?
Llama3 defaults to "ReAct" mode instead of "Function Calling" in the Agent settings because the `isFunctionCall` property is set to `false` by default. This is indicated by the following line in the code:
```tsx
<div className='leading-[18px] text-[13px] font-medium text-gray-900'>{isFunctionCall ? t('appDebug.agent.agentModeType.functionCall') : t('appDebug.agent.agentModeType.ReACT')}</div>
```
To change the default mode for llama3 to "Function Calling", you need to set the `isFunctionCall` property to `true` when initializing the `AgentSetting` component. Here is an example:
```tsx
<AgentSetting
  isChatModel={true} // or false, depending on your use case
  payload={yourPayload}
  isFunctionCall={true} // Set this to true to default to "Function Calling"
  onCancel={yourOnCancelFunction}
  onSave={yourOnSaveFunction}
/>
```
By setting `isFunctionCall` to `true`, the default mode will be "Function Calling" [1][2][3].
To continue talking to Dosu, mention @dosu.
> Why does llama3 default to ReAct instead of Function Call in Agent?
It could be, but we might need the community's help to do it, because we are in full capacity now.
Which Docker image includes this file, and how can I find and modify it?
If I use `docker run` for Dify, which image should I modify, and where can I find this file?
Self Checks
Dify version
0.6.10
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
I added SD to a basic workflow and gave it the input "a dog", but it cannot give me any result: the "text" field in SD's output is empty. Here is the screenshot.
✔️ Expected Behavior
Output a picture or a base64 string of the picture.
❌ Actual Behavior
Output NULL.