langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

The tools parameter was not used when calculating the tokens for the message. #6887

Open 10YearsDiary opened 1 month ago

10YearsDiary commented 1 month ago

Dify version

0.6.15

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

The token-counting methods get_llm_num_tokens and get_num_tokens both define an optional tools parameter. However, when these methods are called in base_app_runner and agent_history_prompt_transform, the tools argument is never passed.

For example, in the following files, get_llm_num_tokens and get_num_tokens are defined to count the tokens in a prompt. Per their signatures, when the message can use tools, the tools parameter must be passed so that the token total covers both the prompt and the tool definitions.

api/core/model_manager.py (line 136)

    def get_llm_num_tokens(self, prompt_messages: list[PromptMessage],
                           tools: Optional[list[PromptMessageTool]] = None) -> int:

api/core/model_runtime/model_providers/__base/large_language_model.py (line 475)

    @abstractmethod
    def get_num_tokens(self, model: str, credentials: dict, prompt_messages: list[PromptMessage],
                       tools: Optional[list[PromptMessageTool]] = None) -> int:
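
For context, providers serialize the tool definitions into the request alongside the prompt messages, so the tool schemas consume context-window tokens as well. A minimal sketch of that accounting, with a stand-in word-count tokenizer to stay self-contained (all names below are illustrative, not Dify's actual implementation):

    import json
    from typing import Optional

    def count_tokens(text: str) -> int:
        # Stand-in tokenizer: real providers use a model-specific
        # tokenizer (e.g. tiktoken); word count is only a rough proxy.
        return len(text.split())

    def get_num_tokens(prompt_messages: list[str],
                       tools: Optional[list[dict]] = None) -> int:
        total = sum(count_tokens(m) for m in prompt_messages)
        if tools:
            # Tool schemas are sent with the request, so count them too.
            total += sum(count_tokens(json.dumps(t)) for t in tools)
        return total

    messages = ["What is the weather in Berlin?"]
    tools = [{"name": "get_weather",
              "description": "Look up the current weather for a city.",
              "parameters": {"city": "string"}}]

    print(get_num_tokens(messages))         # prompt only: undercounts
    print(get_num_tokens(messages, tools))  # prompt plus tool definitions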

✔️ Expected Behavior

We believe that when calling get_llm_num_tokens and get_num_tokens, the tools parameter should be passed:

api/core/app/apps/base_app_runner.py (lines 79-81)
        prompt_tokens = model_instance.get_llm_num_tokens(
            prompt_messages,
            tools
        )

api/core/app/apps/base_app_runner.py (lines 113-115)
        prompt_tokens = model_instance.get_llm_num_tokens(
            prompt_messages,
            tools
        )

api/core/prompt/agent_history_prompt_transform.py (lines 48-50)
        curr_message_tokens = model_type_instance.get_num_tokens(
            self.memory.model_instance.model,
            self.memory.model_instance.credentials,
            self.history_messages,
            tools
        )

❌ Actual Behavior

However, none of the following calls passes the tools parameter, so the tokens consumed by the tool definitions are never counted.

api/core/app/apps/base_app_runner.py (lines 79-81)
        prompt_tokens = model_instance.get_llm_num_tokens(
            prompt_messages
        )

api/core/app/apps/base_app_runner.py (lines 113-115)
        prompt_tokens = model_instance.get_llm_num_tokens(
            prompt_messages
        )

api/core/prompt/agent_history_prompt_transform.py (lines 48-50)
        curr_message_tokens = model_type_instance.get_num_tokens(
            self.memory.model_instance.model,
            self.memory.model_instance.credentials,
            self.history_messages
        )
dosubot[bot] commented 1 month ago

To address the tools parameter not being passed when calculating tokens, include it in the calls to get_llm_num_tokens and get_num_tokens. Here are the necessary changes:

  1. In base_app_runner.py: Update the calls to get_llm_num_tokens to include the tools parameter.

    # Lines 79-81
    prompt_tokens = model_instance.get_llm_num_tokens(
        prompt_messages,
        tools  # Add this line
    )

    # Lines 113-115
    prompt_tokens = model_instance.get_llm_num_tokens(
        prompt_messages,
        tools  # Add this line
    )
  2. In agent_history_prompt_transform.py: Update the call to get_num_tokens to include the tools parameter.

    # Lines 48-50
    curr_message_tokens = model_type_instance.get_num_tokens(
        self.memory.model_instance.model,
        self.memory.model_instance.credentials,
        self.history_messages,
        tools  # Add this line
    )

These changes ensure that the tools parameter is passed through to the token-counting methods, so the total correctly reflects both the prompt and the tool definitions [1][2].
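
As a regression guard, the call sites can be checked with a mock; a hedged sketch, since the real runner's wiring is not shown here (compute_prompt_tokens is a hypothetical stand-in):

    from unittest.mock import MagicMock

    def compute_prompt_tokens(model_instance, prompt_messages, tools):
        # Stand-in for the fixed logic in base_app_runner.py.
        return model_instance.get_llm_num_tokens(prompt_messages, tools)

    def test_tools_are_passed_to_token_count():
        model_instance = MagicMock()
        compute_prompt_tokens(model_instance, ["hello"],
                              [{"name": "get_weather"}])
        # The token count must have received the tools argument.
        model_instance.get_llm_num_tokens.assert_called_once_with(
            ["hello"], [{"name": "get_weather"}])

    test_tools_are_passed_to_token_count()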

To continue talking to Dosu, mention @dosu.