microsoft / promptflow

Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
https://microsoft.github.io/promptflow/
MIT License

[BUG] Running Prompt flow locally produces errors #3751

Open rlerch opened 1 week ago

rlerch commented 1 week ago

Describe the bug
Running a flow with pf test seems to work, but exceptions are reported locally.

How To Reproduce the bug
Run pf test on a flow locally. The flow executes successfully, but exceptions are generated when collecting token metrics for openai.

Expected behavior
A clean run without exceptions.

Screenshots

WARNING:opentelemetry.attributes:Invalid type NoneType for attribute '__computed__.cumulative_token_count.completion' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
WARNING:opentelemetry.attributes:Invalid type NoneType for attribute 'llm.usage.completion_tokens_details' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
WARNING:opentelemetry.attributes:Invalid type NoneType for attribute '__computed__.cumulative_token_count.completion' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types

Running Information (please complete the following information):
{
  "promptflow": "1.15.0",
  "promptflow-azure": "1.15.0",
  "promptflow-core": "1.15.0",
  "promptflow-devkit": "1.15.0",
  "promptflow-tracing": "1.15.0"
}

Executable: 'c:\git\azure-ai-prompt-flow\.venv\Scripts\python.exe'
Python (Windows): 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)]

Additional context
The exception seems to originate here:

    def collect_openai_tokens_for_parent_span(self, span):
        tokens = self.try_get_openai_tokens(span.get_span_context().span_id)
        if tokens:
            if not hasattr(span, "parent") or span.parent is None:
                return
            parent_span_id = span.parent.span_id
            with self._lock:
                if parent_span_id in self._span_id_to_tokens:
                    merged_tokens = {
                        key: self._span_id_to_tokens[parent_span_id].get(key, 0) + tokens.get(key, 0)
                        for key in set(self._span_id_to_tokens[parent_span_id]) | set(tokens)
                    }
                    self._span_id_to_tokens[parent_span_id] = merged_tokens
                else:
                    self._span_id_to_tokens[parent_span_id] = tokens

The failure is on the line key: self._span_id_to_tokens[parent_span_id].get(key, 0) + tokens.get(key, 0). Not all steps in my flow are LLM steps; I wonder whether the issue is caused by steps in the flow that don't produce any tokens.
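
A minimal sketch of the same failure mode, using hypothetical token dicts shaped like the values reported in this thread (illustrative only, run outside Prompt flow): the merge assumes every value is numeric, so a None or nested dict under completion_tokens_details breaks the +.

    # Minimal repro sketch; parent_tokens / child_tokens are hypothetical values
    # shaped like the usage dicts reported in this thread.
    parent_tokens = {"completion_tokens": 13, "prompt_tokens": 673, "total_tokens": 686,
                     "completion_tokens_details": None}
    child_tokens = {"completion_tokens": 7, "prompt_tokens": 120, "total_tokens": 127,
                    "completion_tokens_details": {"reasoning_tokens": 0}}

    try:
        # Same pattern as collect_openai_tokens_for_parent_span: sum values per key.
        merged = {
            key: parent_tokens.get(key, 0) + child_tokens.get(key, 0)
            for key in set(parent_tokens) | set(child_tokens)
        }
    except TypeError as exc:
        # Prints e.g. "unsupported operand type(s) for +: 'NoneType' and 'dict'";
        # which concrete variant (NoneType/int/dict) appears depends on the values
        # that end up in the two dicts.
        print(exc)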

berndku commented 1 week ago

We see something similar; in our case, however, it is an error that fails the execution:

File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/promptflow/tracing/_trace.py", line 144, in <dictcomp>
key: self._span_id_to_tokens[parent_span_id].get(key, 0) + tokens.get(key, 0)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for +: 'NoneType' and 'NoneType'
jwieler commented 1 week ago

Seeing similar errors on my end; this is halting flow execution:

2024-09-13 14:04:35 -0400 46456 execution.flow WARNING Failed to calculate metrics due to exception: unsupported operand type(s) for +: 'int' and 'dict'.

mack-adknown commented 1 week ago

Same issue, resulting in the flow being terminated:

WARNING:opentelemetry.attributes:Invalid type dict for attribute '__computed__.cumulative_token_count.completion' value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
2024-09-13 16:53:20 -0400 28812 execution.flow WARNING Failed to calculate metrics due to exception: unsupported operand type(s) for +: 'int' and 'dict'.
2024-09-13 16:53:20 -0400 28812 execution.flow ERROR Flow execution has failed. Cancelling all running nodes: extract_data.

SULAPIS commented 1 week ago

Some keys in the tokens dict have values that are not ints, which causes the issue:

azure_open_ai: tokens = {'completion_tokens': 13, 'prompt_tokens': 673, 'total_tokens': 686, 'completion_tokens_details': None}
open_ai: tokens = {'completion_tokens': 13, 'prompt_tokens': 669, 'total_tokens': 682, 'completion_tokens_details': {'reasoning_tokens': 0}}
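
One possible local mitigation, sketched here with a made-up helper (merge_token_counts is not part of promptflow), is to sum only plain numeric values so that None or nested dicts such as completion_tokens_details are skipped:

    # Illustrative tolerant merge: keep only int/float token counts so that None
    # or nested dict values cannot break the sum.
    def merge_token_counts(existing: dict, new: dict) -> dict:
        merged = {}
        for key in set(existing) | set(new):
            a, b = existing.get(key, 0), new.get(key, 0)
            if isinstance(a, (int, float)) and isinstance(b, (int, float)):
                merged[key] = a + b
        return merged

    print(merge_token_counts(
        {"completion_tokens": 13, "prompt_tokens": 673, "completion_tokens_details": None},
        {"completion_tokens": 7, "prompt_tokens": 120, "completion_tokens_details": {"reasoning_tokens": 0}},
    ))
    # -> {'completion_tokens': 20, 'prompt_tokens': 793} (key order may vary)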

As a temporary solution, tracing can be disabled by setting PF_DISABLE_TRACING=true.
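
If the flow is driven from Python or CI, the same workaround can be applied before invoking the CLI; a minimal sketch ("./my-flow" is a placeholder for the local flow directory):

    import os
    import subprocess

    # Disable Prompt flow tracing so token metrics are never collected.
    os.environ["PF_DISABLE_TRACING"] = "true"

    # Run the flow through the CLI, as in the original report.
    subprocess.run(["pf", "flow", "test", "--flow", "./my-flow"], check=True)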

asos-oliverfrost commented 1 week ago

Rolling back the version of openai also works; openai<=1.44.1 resolves the error. It looks like promptflow-tracing could be incompatible with openai 1.45.0: https://pypi.org/project/openai/#history

jomalsan commented 1 week ago

@asos-oliverfrost thank you for finding that! Rolling back worked for me. This looks like the specific change that breaks the behavior in prompt flow, because the new completion_tokens_details field is optional: https://github.com/openai/openai-python/compare/v1.44.1...v1.45.0#diff-d85f41ac9f419751206af46c34ef5c8c74258660be492aa703dcbebcfc96a41bR25
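
A quick sanity check for whether an environment has an openai release at or above the threshold reported above (a sketch; the 1.45.0 cutoff comes from this thread):

    # Warn if the installed openai version is at or above 1.45.0, which this
    # thread reports as breaking promptflow's token collection.
    from importlib.metadata import version

    openai_parts = tuple(int(p) for p in version("openai").split(".")[:3])
    if openai_parts >= (1, 45, 0):
        print(f"openai {version('openai')} includes the optional completion_tokens_details "
              "field; consider pinning openai<=1.44.1 or setting PF_DISABLE_TRACING=true.")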

olopezqubika commented 3 days ago

Rolling back OpenAI did not work for me; Pydantic is dynamic enough that I still get completion_tokens_details={'reasoning_tokens': 0}.

cfoster0 commented 3 days ago

Also having this issue, with similar errors:

2024-09-19 18:05:38 -0700   89862 execution          ERROR    Node extract_result in line 0 failed. Exception: Execution failure in 'extract_result': (TypeError) unsupported operand type(s) for +: 'dict' and 'dict'.
Traceback (most recent call last):
  File "/[...]/pypoetry/virtualenvs/promptflow-test-gto235yr-py3.11/lib/python3.11/site-packages/promptflow/_core/flow_execution_context.py", line 182, in _invoke_tool_inner
    return f(**kwargs)
           ^^^^^^^^^^^
  File "/[...]/pypoetry/virtualenvs/promptflow-test-gto235yr-py3.11/lib/python3.11/site-packages/promptflow/tracing/_trace.py", line 561, in wrapped
    token_collector.collect_openai_tokens_for_parent_span(span)
  File "/[...]/pypoetry/virtualenvs/promptflow-test-gto235yr-py3.11/lib/python3.11/site-packages/promptflow/tracing/_trace.py", line 143, in collect_openai_tokens_for_parent_span
    merged_tokens = {
                    ^
  File "/[...]/pypoetry/virtualenvs/promptflow-test-gto235yr-py3.11/lib/python3.11/site-packages/promptflow/tracing/_trace.py", line 144, in <dictcomp>
    key: self._span_id_to_tokens[parent_span_id].get(key, 0) + tokens.get(key, 0)
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for +: 'dict' and 'dict'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/[...]/pypoetry/virtualenvs/promptflow-test-gto235yr-py3.11/lib/python3.11/site-packages/promptflow/_core/flow_execution_context.py", line 90, in invoke_tool
    result = self._invoke_tool_inner(node, f, kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/[...]/pypoetry/virtualenvs/promptflow-test-gto235yr-py3.11/lib/python3.11/site-packages/promptflow/_core/flow_execution_context.py", line 206, in _invoke_tool_inner
    raise ToolExecutionError(node_name=node_name, module=module) from e
promptflow._core._errors.ToolExecutionError: Execution failure in 'extract_result': (TypeError) unsupported operand type(s) for +: 'dict' and 'dict'
2024-09-19 18:05:38 -0700   89862 execution.flow     WARNING  Failed to calculate metrics due to exception: unsupported operand type(s) for +: 'int' and 'dict'.
2024-09-19 18:05:38 -0700   89862 execution.flow     ERROR    Flow execution has failed. Cancelling all running nodes: extract_result.
pf.flow.test failed with UserErrorException: TypeError: Execution failure in 'extract_result': (TypeError) unsupported operand type(s) for +: 'dict' and 'dict'

This happens when running a version of the chat-math-variant example, edited so that extract_text.py calls the OpenAI ChatCompletions endpoint. One of my coworkers is also seeing this error, seemingly from an LLM tool call.