I am interested in the CodeAct framework and would like to reproduce Table 2 in the original paper, the results of atomic API call correctness on API-Bank.
I saw that experiments were conducted on Mistral-7B-Instruct-v0.1. I followed the instructions for API-Bank evaluation preparation in the repo. However, as far as I know, all LLMs developed by Mistral do not support system prompts. Directly running the run.sh evaluation script would lead to an error, stating the incorrect formatting of the messages.
I am wondering how to obtain values in the table for Mistral-7B-Instruct-v0.1.
Thanks for your great work!
I am interested in the CodeAct framework and would like to reproduce Table 2 in the original paper, the results of atomic API call correctness on API-Bank.
I saw that experiments were conducted on Mistral-7B-Instruct-v0.1. I followed the instructions for API-Bank evaluation preparation in the repo. However, as far as I know, all LLMs developed by Mistral do not support system prompts. Directly running the
run.sh
evaluation script would lead to an error, stating the incorrect formatting of the messages.I am wondering how to obtain values in the table for Mistral-7B-Instruct-v0.1.