ShishirPatil / gorilla

Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
https://gorilla.cs.berkeley.edu/
Apache License 2.0
11.46k stars, 995 forks

[BFCL] Whether there is any semantic checker for the parameters' values? #643

Open Ghevil opened 1 month ago

Ghevil commented 1 month ago

Describe the issue

When we run the evaluation, is there a checker that verifies the semantic consistency of the parameter values, instead of simply matching them against the candidate list?

It seems too harsh that minor formatting differences are counted as errors. For example, San Diego and San Diego, CA differ in format, but the two are semantically consistent.

Another example is this error: "Invalid value for parameter 'items': ['pumpkin', 'egg']. Expected one of [['pumpkins', 'eggs'], ['pumpkin', 'dozen eggs']]."
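
To make the behavior concrete, here is a minimal sketch of how exact matching against a candidate list could produce an error like that. It is only an illustration under my own assumptions: the function name `check_against_candidates` and its signature are hypothetical and are not taken from the BFCL code.

```python
def check_against_candidates(param, value, candidates):
    """Return None if value exactly equals one candidate, else an error string.

    Hypothetical helper for illustration only; not the BFCL implementation.
    """
    # Exact equality: ['pumpkin', 'egg'] != ['pumpkins', 'eggs'],
    # even though a human reader would treat them as the same order.
    if value in candidates:
        return None
    return (
        f"Invalid value for parameter {param!r}: {value!r}. "
        f"Expected one of {candidates!r}."
    )


error = check_against_candidates(
    "items",
    ["pumpkin", "egg"],
    [["pumpkins", "eggs"], ["pumpkin", "dozen eggs"]],
)
print(error)
```

With exact matching, any value outside the candidate list fails, no matter how close it is in meaning.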

Looking forward to your reply.

HuanzhiMao commented 1 month ago

Hi @Ghevil,

Unfortunately, we don't have any semantic checkers at this moment.

Besides making the candidate list as comprehensive as possible, we also try to state explicit restrictions on parameter formats in the function documentation. For the location example you provided, we would phrase the parameter as follows, so that only San Diego, CA is correct and San Diego is wrong.

{
    "location": {
        "type": "string",
        "description": "The location in 'city, state' format."
    }
}
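
As a rough illustration of what such a format restriction rules in and out, the sketch below tests a value against a 'city, state' pattern. The regular expression, the assumption that the state is a two-letter uppercase code, and the name `matches_city_state` are all hypothetical; this is not how the benchmark checks answers.

```python
import re

# Assumed pattern: a city name, a comma and a space, then a two-letter state code.
# The description only says 'city, state', so the two-letter code is an assumption.
CITY_STATE = re.compile(r"[A-Za-z .'-]+, [A-Z]{2}")


def matches_city_state(value: str) -> bool:
    """Return True if value looks like 'City, ST' under the assumed pattern."""
    return CITY_STATE.fullmatch(value) is not None


print(matches_city_state("San Diego, CA"))  # follows the documented format
print(matches_city_state("San Diego"))      # missing the state part
```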

As another example, for parameters like dates, the description fixes the format, so there is only one correct ground truth.

{
    "end_date": {
        "type": "string",
        "description": "The ending date until which to retrieve stock prices. Format: 'yyyy-mm-dd'."
    }
}
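
A fixed date format like this is easy to validate mechanically. The sketch below uses Python's standard `datetime.strptime` to test whether a string is a real calendar date in 'yyyy-mm-dd' form. The helper name `is_yyyy_mm_dd` is hypothetical and this is not part of the evaluation code.

```python
from datetime import datetime


def is_yyyy_mm_dd(value: str) -> bool:
    """Return True if value is a valid calendar date written as yyyy-mm-dd."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        # Wrong separators, wrong field order, or an impossible date.
        return False
    # strptime also accepts unpadded fields such as "2024-1-5";
    # the round-trip comparison requires the zero-padded form.
    return parsed.strftime("%Y-%m-%d") == value


print(is_yyyy_mm_dd("2024-01-05"))
print(is_yyyy_mm_dd("01/05/2024"))
```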

If you have any other good solutions, we would love to hear your thoughts.