[BFCL] Add Support for Regeneration, Specific Test Entry IDs, and Custom Directory Locations

Raymond112514 commented 2 weeks ago

This pull request introduces several improvements and new features to the CLI commands:

Regeneration Support
- Added the --allow-overwrite flag to the generation command.
- This option allows regeneration of test entries even if some entries already exist.
- The flag is only valid for the generate command.
Selective Test Entry Execution
- Introduced a new --run-ids flag:
  - When set, the command reads a list of test entry IDs from the file test_case_ids_to_generate.json.
  - Only those specific test IDs will be executed, instead of the entire category.
  - This feature is also exclusive to the generate command, and cannot be used together with --test-category.
Customizable Result and Score Directories
- Added --score-dir and --result-dir options for both the generate and evaluate commands.
- These options allow users to specify custom paths for result and score directories.
- Paths should be relative to the root folder of berkeley-function-call-leaderboard.
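
As a rough illustration of how the flags described above fit together, here is a minimal sketch using argparse. It is not the PR's actual implementation: the parser layout, the default directory names ("result", "score"), and the helper names build_generate_parser and resolve_dir are assumptions made for this example only. It shows --run-ids and --test-category rejecting each other, and custom directories being resolved relative to the project root.

```python
import argparse
from pathlib import Path

# Assumed stand-in for the root folder of berkeley-function-call-leaderboard.
PROJECT_ROOT = Path("berkeley-function-call-leaderboard")


def build_generate_parser() -> argparse.ArgumentParser:
    """Hypothetical parser for the generate command (illustration only)."""
    parser = argparse.ArgumentParser(prog="generate")

    # --run-ids and --test-category cannot be combined, per the description.
    selection = parser.add_mutually_exclusive_group()
    selection.add_argument("--test-category", nargs="+", default=None)
    selection.add_argument("--run-ids", action="store_true")

    # Regenerate entries even if results for them already exist.
    parser.add_argument("--allow-overwrite", action="store_true")

    # Default directory names here are assumptions, not taken from the PR.
    parser.add_argument("--result-dir", default="result")
    parser.add_argument("--score-dir", default="score")
    return parser


def resolve_dir(relative_path: str) -> Path:
    """Interpret a user-supplied directory as relative to the project root."""
    return PROJECT_ROOT / relative_path
```

With this sketch, passing both --run-ids and --test-category makes argparse exit with a usage error, while `resolve_dir("out/result")` yields a path under the project root.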

In addition, this PR updates the check_illegal_python_param_name.py script so that it no longer stores the functions in the multi_turn categories; this does not affect the accuracy of the dataset.

Raymond112514 commented 2 weeks ago

Added the --rerun-all flag. When this flag is present, the results are overwritten. Changed the logic of collect_test_case slightly.
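
The overwrite behaviour described above could be sketched roughly as follows. This is not the actual collect_test_case code: the function name select_pending_entries, the "id" key on each entry, and the shape of existing_ids are all assumptions for illustration.

```python
def select_pending_entries(test_entries, existing_ids, rerun_all=False):
    """Return the entries that still need to be generated.

    test_entries: list of dicts, each assumed to carry an "id" key.
    existing_ids: set of IDs that already have a stored result.
    rerun_all:    when True, every entry is regenerated and old results
                  are expected to be overwritten by the caller.
    """
    if rerun_all:
        return list(test_entries)
    # Otherwise skip anything that already has a result on disk.
    return [entry for entry in test_entries if entry["id"] not in existing_ids]
```

Without rerun_all, only entries lacking a result are returned; with it, the full list comes back regardless of what already exists.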

Raymond112514 commented 2 weeks ago

Added the --result-dir and --score-dir options.

CharlieJCJ commented 1 day ago

Testing on my side:

ShishirPatil / gorilla

[BFCL] Add Support for Regeneration, Specific Test Entry IDs, and Custom Directory Locations #743