langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Firecrawl started throwing the error: 'Failed to invoke tool: unhashable type: 'dict'.' #8792

Open ugyuji opened 2 days ago

ugyuji commented 2 days ago

Self Checks

Dify version

0.8.3

Cloud or Self Hosted

Cloud, Self Hosted (Docker)

Steps to reproduce

  1. Create a new "Workflow"
  2. Add Firecrawl's "Crawl" tool
  3. Set a "Start URL" variable to "https://dify.ai/"
  4. Run the workflow

✔️ Expected Behavior

No response

❌ Actual Behavior

(attached image: 35b2061a9ead09d037198315f3bd573a)

dosubot[bot] commented 2 days ago

Hey @ugyuji! I'm here to help you with any bugs, questions, or contributions you have for our project. Let's squash those bugs together! 😊

I found a similar issue that might help you resolve the error "Failed to invoke tool: unhashable type: 'dict'". The error occurs because the tool_parameters dictionary is being used in a context where a hashable type is required. In Python, dictionaries are not hashable and cannot be used as keys in other dictionaries or added to sets.

To resolve this issue, ensure that tool_parameters or any of its contents are not being used in a way that requires them to be hashable. For example, if tool_parameters is being used as a key in another dictionary, you should convert it to a hashable type, such as a tuple of key-value pairs.
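As a small illustration of the point above, the sketch below reproduces the same `TypeError` message with plain Python and shows one way to turn a nested dictionary into a hashable key. The names `params` and `freeze` are hypothetical and are not part of Dify or Firecrawl; the sketch assumes all dictionary keys are strings.

```python
# Minimal sketch: why a dict cannot be hashed, and one way to derive a hashable key.
# `params` and `freeze` are hypothetical names used only for this illustration.

params = {
    "url": "https://dify.ai/",
    "headers": {"Accept": "text/html"},
    "formats": ["markdown"],
}

# Hashing a dict raises TypeError; this is the message seen in the issue title.
try:
    hash(params)
    message = ""
except TypeError as exc:
    message = str(exc)


def freeze(value):
    """Recursively convert dicts and lists into tuples so the result is hashable."""
    if isinstance(value, dict):
        # Sort by key so that insertion order does not change the result.
        items = sorted(value.items(), key=lambda kv: kv[0])
        return tuple((k, freeze(v)) for k, v in items)
    if isinstance(value, (list, tuple)):
        return tuple(freeze(v) for v in value)
    return value


# The frozen form can be used as a dictionary key or stored in a set.
key = freeze(params)
cache = {key: "crawl result"}
```

Any code that caches or deduplicates on its arguments would need a conversion of this kind before being handed a dictionary. Whether that is what happens inside the Firecrawl tool is not established by this sketch.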

Here is the relevant part of the code for reference:

from typing import Any

# BuiltinTool, ToolInvokeMessage, FirecrawlApp, get_array_params and get_json_params
# come from Dify's own tool modules; their imports are omitted in this excerpt.


class CrawlTool(BuiltinTool):
    def _invoke(self, user_id: str, tool_parameters: dict[str, Any]) -> ToolInvokeMessage:
        # Build the Firecrawl client from the credentials stored for this tool.
        app = FirecrawlApp(
            api_key=self.runtime.credentials["firecrawl_api_key"], base_url=self.runtime.credentials["base_url"]
        )

        scrapeOptions = {}
        payload = {}

        wait_for_results = tool_parameters.get("wait_for_results", True)

        # Crawl-level options.
        payload["excludePaths"] = get_array_params(tool_parameters, "excludePaths")
        payload["includePaths"] = get_array_params(tool_parameters, "includePaths")
        payload["maxDepth"] = tool_parameters.get("maxDepth")
        payload["ignoreSitemap"] = tool_parameters.get("ignoreSitemap", False)
        payload["limit"] = tool_parameters.get("limit", 5)
        payload["allowBackwardLinks"] = tool_parameters.get("allowBackwardLinks", False)
        payload["allowExternalLinks"] = tool_parameters.get("allowExternalLinks", False)
        payload["webhook"] = tool_parameters.get("webhook")

        # Per-page scrape options, nested under payload["scrapeOptions"].
        scrapeOptions["formats"] = get_array_params(tool_parameters, "formats")
        scrapeOptions["headers"] = get_json_params(tool_parameters, "headers")
        scrapeOptions["includeTags"] = get_array_params(tool_parameters, "includeTags")
        scrapeOptions["excludeTags"] = get_array_params(tool_parameters, "excludeTags")
        scrapeOptions["onlyMainContent"] = tool_parameters.get("onlyMainContent", False)
        scrapeOptions["waitFor"] = tool_parameters.get("waitFor", 0)
        # Drop unset options (None or empty string) before sending the request.
        scrapeOptions = {k: v for k, v in scrapeOptions.items() if v not in (None, "")}
        payload["scrapeOptions"] = scrapeOptions or None

        payload = {k: v for k, v in payload.items() if v not in (None, "")}

        crawl_result = app.crawl_url(url=tool_parameters["url"], wait=wait_for_results, **payload)

        return self.create_json_message(crawl_result)

Ensure that tool_parameters is not being used in a way that requires it to be hashable, and the error should be resolved. For instance, the scrapeOptions and payload dictionaries are filtered to remove None or empty values before being used, which helps avoid such issues [1][2][3].
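The filtering step mentioned above can be tried in isolation. The following is a minimal sketch using the same comprehension as the quoted code; the helper name `drop_empty` and the sample values are hypothetical and chosen only for illustration.

```python
# Minimal sketch of the "drop None or empty string" filter used in the quoted code.
# `drop_empty` and the sample values below are hypothetical.

def drop_empty(options):
    """Return a copy of `options` without entries whose value is None or ''."""
    # Membership in a tuple is checked with ==, so dict values are compared
    # without being hashed.
    return {k: v for k, v in options.items() if v not in (None, "")}


scrape_options = {
    "formats": None,
    "headers": {"Accept": "text/html"},
    "includeTags": "",
    "onlyMainContent": False,
    "waitFor": 0,
}

cleaned = drop_empty(scrape_options)

# When every option is unset, `or None` collapses the empty dict to None,
# mirroring `payload["scrapeOptions"] = scrapeOptions or None`.
nothing_set = drop_empty({"formats": None, "includeTags": ""}) or None
```

Note that `False` and `0` survive the filter, since only `None` and the empty string are removed.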


hjlarry commented 2 days ago

I can't reproduce. Can you upload your workflow DSL?

Halflifefa commented 2 days ago

Same problem here. The workflow I used is the official template "Interview Preparation (Conditional Parallel)", and the API key was obtained from https://www.firecrawl.dev/app/api-keys

hjlarry commented 2 days ago

I can now reproduce it in the cloud version,
but it works on my local copy of the latest branch. Maybe a PR has already fixed this.

ugyuji commented 2 days ago

Here's the DSL content:

app:
  description: ''
  icon: 🤖
  icon_background: '#FFEAD5'
  mode: workflow
  name: Test Workflow
  use_icon_as_answer_icon: false
kind: app
version: 0.1.2
workflow:
  conversation_variables: []
  environment_variables: []
  features:
    file_upload:
      image:
        enabled: false
        number_limits: 3
        transfer_methods:
        - local_file
        - remote_url
    opening_statement: ''
    retriever_resource:
      enabled: false
    sensitive_word_avoidance:
      enabled: false
    speech_to_text:
      enabled: false
    suggested_questions: []
    suggested_questions_after_answer:
      enabled: false
    text_to_speech:
      enabled: false
      language: ''
      voice: ''
  graph:
    edges:
    - data:
        isInIteration: false
        sourceType: start
        targetType: tool
      id: 1712630129285-source-1727334022276-target
      source: '1712630129285'
      sourceHandle: source
      target: '1727334022276'
      targetHandle: target
      type: custom
      zIndex: 0
    - data:
        isInIteration: false
        sourceType: tool
        targetType: end
      id: 1727334022276-source-1713020453724-target
      source: '1727334022276'
      sourceHandle: source
      target: '1713020453724'
      targetHandle: target
      type: custom
      zIndex: 0
    nodes:
    - data:
        desc: ''
        selected: false
        title: Start
        type: start
        variables:
        - label: url
          max_length: 256
          options: []
          required: true
          type: text-input
          variable: url
      height: 90
      id: '1712630129285'
      position:
        x: 30
        y: 427
      positionAbsolute:
        x: 30
        y: 427
      selected: true
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        desc: ''
        outputs:
        - value_selector:
          - '1727334022276'
          - text
          variable: output
        selected: false
        title: End
        type: end
      height: 90
      id: '1713020453724'
      position:
        x: 638
        y: 427
      positionAbsolute:
        x: 638
        y: 427
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    - data:
        desc: ''
        provider_id: firecrawl
        provider_name: firecrawl
        provider_type: builtin
        selected: false
        title: Crawl
        tool_configurations:
          allowBackwardLinks: 0
          allowExternalLinks: 0
          excludePaths: null
          excludeTags: null
          formats: null
          headers: null
          ignoreSitemap: 1
          includePaths: null
          includeTags: null
          limit: 5
          maxDepth: 2
          onlyMainContent: 0
          waitFor: null
          wait_for_results: 1
          webhook: null
        tool_label: Crawl
        tool_name: crawl
        tool_parameters:
          url:
            type: mixed
            value: '{{#1712630129285.url#}}'
        type: tool
      height: 454
      id: '1727334022276'
      position:
        x: 334
        y: 427
      positionAbsolute:
        x: 334
        y: 427
      selected: false
      sourcePosition: right
      targetPosition: left
      type: custom
      width: 244
    viewport:
      x: 307.80000000000007
      y: -8.799999999999955
      zoom: 0.7

The same error can be seen in both local and cloud environments.

ujway commented 1 day ago

I get the same error in the following cases:

My workaround for now is to use JinaReader instead.

(attached image: CleanShot 2024-09-27 at 15 54 43@2x)

hjlarry commented 1 day ago

PR https://github.com/langgenius/dify/pull/8391 introduced the issue, and PR https://github.com/langgenius/dify/pull/8666 fixes it. Please wait for the next release, or git pull the latest commit to get the fix.

JustinWangJP commented 1 day ago

I'm also facing the same problem. When will a new version be published?