microsoft / promptflow

Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
https://microsoft.github.io/promptflow/
MIT License

[BUG] promptflow built-in evaluator cannot take nested JSON outputs as column mapping #3604

Open yanggaome opened 3 months ago

yanggaome commented 3 months ago

Describe the bug

I am using the promptflow evaluate API, where I provide a target callable:

from promptflow.evals.evaluate import evaluate
from promptflow.evals.evaluators import ContentSafetyEvaluator

def target_call():
    ...
    # The target returns a nested dict.
    return {"outputs": {"outputA": "A", "outputB": "B"}}

evaluate(
    target=target_call,
    evaluators={"content_safety": ContentSafetyEvaluator()},
    evaluator_config={
        "content_safety": {"question": "${data.query}", "answer": "${target.outputs.outputA}"},
    },
)

Using ${target.outputs.outputA} does not work: validation complains that it cannot find "answer". I can only map ${target.outputs} as a whole. I understand this may not be a bug so much as a feature request.

Is there a way to support this? The error I get is:

  File "/anaconda/envs/azureml_py38/lib/python3.9/site-packages/promptflow/evals/evaluate/_evaluate.py", line 425, in _evaluate
    _validate_columns(input_data_df, evaluators, target=None, evaluator_config=evaluator_config)
  File "/anaconda/envs/azureml_py38/lib/python3.9/site-packages/promptflow/evals/evaluate/_evaluate.py", line 153, in _validate_columns
    _validate_input_data_for_evaluator(evaluator, evaluator_name, new_df)
  File "/anaconda/envs/azureml_py38/lib/python3.9/site-packages/promptflow/evals/evaluate/_evaluate.py", line 81, in _validate_input_data_for_evaluator
    raise ValueError(f"Missing required inputs for evaluator {evaluator_name} : {missing_inputs}.")
ValueError: Missing required inputs for evaluator content_safety : ['answer'].
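
For reference, a minimal sketch of the flattening workaround this seems to force (names are illustrative, assuming the mapping only resolves top-level keys of the target's return value):

# Workaround sketch (illustrative only): return a flat dict from the target so
# the evaluator input can be referenced as a top-level key in the mapping.
def flat_target_call():
    ...
    return {"outputA": "A", "outputB": "B"}

evaluate(
    target=flat_target_call,
    evaluators={"content_safety": ContentSafetyEvaluator()},
    evaluator_config={
        "content_safety": {"question": "${data.query}", "answer": "${target.outputA}"},
    },
)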

How To Reproduce the bug: Run the evaluate call above with a nested column mapping such as "${target.outputs.outputA}".

Expected behavior: A nested column mapping such as "${target.outputs.outputA}" resolves to the corresponding field of the target's output.


Running Information:

{ "promptflow": "1.13.0", "promptflow-azure": "1.13.0", "promptflow-core": "1.13.0", "promptflow-devkit": "1.13.0", "promptflow-evals": "0.3.1", "promptflow-tracing": "1.13.0" }

Executable '/anaconda/envs/azureml_py38/bin/python' Python (Linux) 3.9.19 | packaged by conda-forge | (main, Mar 20 2024, 12:50:21) [GCC 12.3.0]


jomalsan commented 2 months ago

+1 on the ability to pass nested fields to any flow as a column mapping. I believe the general case is directly tied to this bug, but I am happy to open an independent issue if preferred.

Say my input data looks like:

{
    "foo": {
        "bar": "baz"
    }
}

Then I would love to be able to pass a column mapping of:

column_mapping:
  input: ${data.foo.bar}

I would prefer not to pass ${data.foo} and then parse out "bar" in my own code (roughly the pattern sketched below).
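
A rough sketch of that parent-object-plus-manual-unwrapping pattern, with illustrative names:

# With the mapping limited to "input: ${data.foo}", my own code has to dig out
# the nested field itself:
def my_flow(input: dict) -> str:
    return input["bar"]  # unwrapping that the column mapping could have done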

luigiw commented 2 months ago

Thanks for the feedback! We need a standard way to allow complex JSON mapping, something like JSONPath. If you have a suggestion for an industry standard, definitely let us know. We'll estimate the work and come back to this.
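
For concreteness, a minimal sketch of what JSONPath-style resolution of a nested mapping could look like, using the third-party jsonpath-ng library purely as an illustration (the helper name is hypothetical, not a committed promptflow design):

# Illustration only: resolve a dotted/JSONPath expression against a nested row.
from jsonpath_ng import parse

def resolve_mapping(expression: str, row: dict):
    # e.g. expression = "outputs.outputA", row = the target's return value
    matches = parse(expression).find(row)
    return matches[0].value if matches else None

row = {"outputs": {"outputA": "A", "outputB": "B"}}
print(resolve_mapping("outputs.outputA", row))  # -> "A"

Something like this would cover both the "${target.outputs.outputA}" and "${data.foo.bar}" cases above.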

mallapraveen commented 3 weeks ago

Do we have a solution or workaround for this?