Closed guanyu-nie closed 16 minutes ago
Could you give an example of the 'blank' function call you are referring to?
In BFCL_v3_multi_turn_miss_param.json, the possible answer for the first data point is
[["cd(folder='document')", "mkdir(dir_name='temp')", "mv(source='final_report.pdf', destination='temp')"], ["cd(folder='temp')", "grep(file_name='final_report.pdf',pattern='budget analysis')"], ["sort('final_report.pdf')"], [], ["cd(folder='..')", "mv(source='previous_report.pdf',destination='temp')", "cd(folder='temp')", "diff(file_name1='final_report.pdf',file_name2='previous_report.pdf')"]]}
where the 4th entry is the answer for missing parameter query. We will train the model to output something like:
["cd(folder='..')", "mv(source='',destination='temp')", "cd(folder='temp')", "diff(file_name1='final_report.pdf',file_name2='previous_report.pdf')"]
The second move misses a required parameter thus it will be decoded as []
in post processing.
How do you distinguish between the situation where the actual parameter value is ""
vs. the model output your placeholder format?
The details might need further considerations but will this kind of post process be allowed?
Yes. Some similar would be the NexusRaven-V2 (handler code here). They add an out_of_domain
function to each test entry, and if the model invokes that function, then they mark their model response as irrelevant
.
We recognize that in missing parameter multi-turn dataset, if there is a missing parameter round, the ground truth label is an empty list (or undecodable), which is the same as an irrelevance signal. When training, we don't want to treat this scenario as irrelevance. Instead, we'd like to still output a function call, but leave the parameter as blank. Under this situation, a potential agent module (seperated from function call module) can ask user to fill in the parameter. Under this situation, for evaluation on BFCL, we need a post process to convert this partial function call to the irrelevance signal. Would this post process be allowed?