ShishirPatil / gorilla

Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
https://gorilla.cs.berkeley.edu/
Apache License 2.0
11.47k stars 997 forks source link

[BFCL] Missing parameter post process #685

Closed guanyu-nie closed 16 minutes ago

guanyu-nie commented 1 month ago

We recognize that in missing parameter multi-turn dataset, if there is a missing parameter round, the ground truth label is an empty list (or undecodable), which is the same as an irrelevance signal. When training, we don't want to treat this scenario as irrelevance. Instead, we'd like to still output a function call, but leave the parameter as blank. Under this situation, a potential agent module (seperated from function call module) can ask user to fill in the parameter. Under this situation, for evaluation on BFCL, we need a post process to convert this partial function call to the irrelevance signal. Would this post process be allowed?

HuanzhiMao commented 1 month ago

Could you give an example of the 'blank' function call you are referring to?

guanyu-nie commented 1 month ago

In BFCL_v3_multi_turn_miss_param.json, the possible answer for the first data point is [["cd(folder='document')", "mkdir(dir_name='temp')", "mv(source='final_report.pdf', destination='temp')"], ["cd(folder='temp')", "grep(file_name='final_report.pdf',pattern='budget analysis')"], ["sort('final_report.pdf')"], [], ["cd(folder='..')", "mv(source='previous_report.pdf',destination='temp')", "cd(folder='temp')", "diff(file_name1='final_report.pdf',file_name2='previous_report.pdf')"]]} where the 4th entry is the answer for missing parameter query. We will train the model to output something like: ["cd(folder='..')", "mv(source='',destination='temp')", "cd(folder='temp')", "diff(file_name1='final_report.pdf',file_name2='previous_report.pdf')"] The second move misses a required parameter thus it will be decoded as [] in post processing.

HuanzhiMao commented 1 month ago

How do you distinguish between the situation where the actual parameter value is "" vs. the model output your placeholder format?

guanyu-nie commented 1 month ago

The details might need further considerations but will this kind of post process be allowed?

HuanzhiMao commented 1 month ago

Yes. Some similar would be the NexusRaven-V2 (handler code here). They add an out_of_domain function to each test entry, and if the model invokes that function, then they mark their model response as irrelevant.