langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

The answers to the same question are confusing #5851

Open EnzoLiang opened 5 days ago

EnzoLiang commented 5 days ago

Self Checks

Dify version

0.6.11

Cloud or Self Hosted

Self Hosted (Source)

Steps to reproduce

I created an agent, added a math tool to it, and selected a Qwen LLM deployed with Xinference for testing. When I first asked the following question, the bot hallucinated, and when I clicked on the agent log, a "provider type not found" message appeared in the top right corner. However, when I asked the same question again, it responded normally. [screenshot]

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

dosubot[bot] commented 5 days ago

@EnzoLiang Hello there! I'm here to help with bugs, answer questions, and assist you while waiting for a human maintainer. Let's squash those bugs together! 😉

The "provider type not found" error on the first attempt but normal response on the second attempt could be due to an issue with the model configuration or the implementation of the model invocation code. Specifically, it might be related to how the model type is defined and initialized in the YAML configuration and the corresponding Python code.

Ensure that the model type is correctly specified in the YAML configuration file and that the model invocation code properly handles the initialization and invocation of the model. Here is a checklist to verify:

  1. YAML Configuration: Ensure that the model type and other properties are correctly defined in the YAML file.

    model: qwen-llm  # Model identifier
    label:
      en_US: Qwen LLM
    model_type: llm  # Model type
    features:
    - agent-thought
    model_properties:
      mode: chat
      context_size: 200000
    parameter_rules:
    - name: temperature
      use_template: temperature
    - name: top_p
      use_template: top_p
    - name: top_k
      type: int
      required: false
    - name: max_tokens_to_sample
      use_template: max_tokens
      default: 4096
      min: 1
      max: 4096
    pricing:
      input: '8.00'
      output: '24.00'
      unit: '0.000001'
      currency: USD
  2. Model Invocation Code: Ensure that the model invocation code in llm.py is correctly implemented and handles both synchronous and streaming responses properly; a sketch of where the response object might come from is shown after this list.

    def _invoke(self, model: str, credentials: dict,
                prompt_messages: list[PromptMessage], model_parameters: dict,
                tools: Optional[list[PromptMessageTool]] = None, stop: Optional[list[str]] = None,
                stream: bool = True, user: Optional[str] = None) \
            -> Union[LLMResult, Generator]:
        # Dispatch to the streaming or synchronous handler depending on `stream`
        if stream:
            return self._handle_stream_response(model, credentials, prompt_messages, model_parameters, tools, stop, user)
        return self._handle_sync_response(model, credentials, prompt_messages, model_parameters, tools, stop, user)

    def _handle_stream_response(self, model: str, credentials: dict, prompt_messages: list[PromptMessage],
                                model_parameters: dict, tools: Optional[list[PromptMessageTool]] = None,
                                stop: Optional[list[str]] = None, user: Optional[str] = None) -> Generator:
        # `response` stands in for the streaming response returned by the actual call to Xinference
        for chunk in response:
            yield chunk

    def _handle_sync_response(self, model: str, credentials: dict, prompt_messages: list[PromptMessage],
                              model_parameters: dict, tools: Optional[list[PromptMessageTool]] = None,
                              stop: Optional[list[str]] = None, user: Optional[str] = None) -> LLMResult:
        # `response` stands in for the completed response returned by the actual call to Xinference
        return LLMResult(**response)
  3. Error Mapping: Ensure that the error mapping is correctly implemented to handle different types of invocation errors.

    @property
    def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
        return {
            InvokeConnectionError: [ConnectionError],
            InvokeServerUnavailableError: [ServerUnavailableError],
            InvokeRateLimitError: [RateLimitError],
            InvokeAuthorizationError: [AuthorizationError],
            InvokeBadRequestError: [BadRequestError]
        }
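
For reference, here is a minimal sketch of where the response object used by the handlers above might come from. It assumes Xinference's OpenAI-compatible API on the default local endpoint and uses the openai Python client; the model UID and prompt are placeholders, not Dify's actual implementation.

    from openai import OpenAI

    # Point an OpenAI-compatible client at the local Xinference server (default port assumed)
    client = OpenAI(base_url="http://127.0.0.1:9997/v1", api_key="not-used")

    # "model" must be the UID of the model deployed in Xinference (placeholder value below)
    response = client.chat.completions.create(
        model="<xinference-model-uid>",
        messages=[{"role": "user", "content": "What is 12 * 7?"}],
        stream=False,
    )
    print(response.choices[0].message.content)

Running a request like this directly is also a quick way to tell whether the problem lies in Dify's provider layer or in the Xinference deployment itself.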

Additionally, ensure that the Xinference service is correctly deployed and accessible. Here are the steps to verify:

  1. Deploy Xinference: Ensure that Xinference is installed and started correctly.

    $ pip install "xinference[all]"
    $ xinference-local
  2. Access Endpoint: Verify that the Xinference service is running and accessible at the specified endpoint (e.g., http://127.0.0.1:9997); a quick way to check this (and to look up the model UID from step 4) is sketched after this list.

  3. Network Configuration: If using Docker, ensure that the network configuration allows the Dify container to reach the Xinference endpoint. localhost inside the Dify container does not refer to the host machine, so use the host's IP address instead.

  4. Model Deployment: Ensure that the model is correctly deployed in Xinference and that you have the correct model UID.

  5. Dify Configuration: In Dify, configure the Xinference model provider with the correct model name, server URL, and model UID.
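
As a quick reachability check (a sketch, assuming the default endpoint http://127.0.0.1:9997 and Xinference's OpenAI-compatible /v1/models route), you can list the deployed models and read off their UIDs:

    import requests

    # Query the OpenAI-compatible model listing on the Xinference server (default endpoint assumed)
    resp = requests.get("http://127.0.0.1:9997/v1/models", timeout=10)
    resp.raise_for_status()

    # Each entry's "id" is the model UID to use when configuring the provider in Dify
    for model in resp.json().get("data", []):
        print(model["id"])

If this request fails when run from inside the Dify container, the problem is the network configuration described in step 3.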

By ensuring that the model type is correctly defined, the invocation code is properly implemented, and the Xinference service is correctly deployed and accessible, you can address the "provider type not found" error and the hallucination issue [1][2].

To continue talking to Dosu, mention @dosu.

EnzoLiang commented 5 days ago

@dosu Thank you for your reply, but I don't think that's the reason. Other questions are answered normally. Here is a screenshot of the specific error: [screenshot]