HowieHwong / MetaTool

[ICLR 2024] MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use
MIT License

Tool usage awareness #4

Open cp-jose opened 10 months ago

cp-jose commented 10 months ago

For this task (2.3.1 Awareness of tool usage), whose results are shown in Table 3 (Section 3.2), the positive samples - where the LLM needs to use a tool to address the user query - are drawn from the generated single-tool user queries (file dataset/data/all_clean_data.csv, with 20,615 queries), and the negative samples - where the LLM does not need a tool - are said to be drawn from three recent instruction datasets.

From the wording of the experimental section, it seems the test set (used to produce Table 3) has a 50%/50% split of positive/negative samples. I have a few questions about this:

  1. Is this proportion correct?
  2. Exactly how many queries are used in this experiment?
  3. Are the negative samples made available in the repository? If so, where? (I couldn't find them.)

HowieHwong commented 10 months ago

Sorry for the confusion. The proportion is correct. We use 515 x 2 samples in total (515 positive and 515 negative).
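For illustration, a balanced set of this size could be assembled roughly as in the sketch below. This is only a sketch under stated assumptions, not the script used for the paper: the function name `build_balanced_set`, the seed, and the placeholder query pools are all hypothetical, and the real positives would come from dataset/data/all_clean_data.csv rather than generated strings.

```python
import random


def build_balanced_set(positives, negatives, n_per_class=515, seed=0):
    """Sample n_per_class queries from each pool and attach a binary label.

    Label 1 means the query needs a tool, label 0 means it does not.
    The function name, seed, and labels are illustrative assumptions.
    """
    rng = random.Random(seed)
    pos = rng.sample(positives, n_per_class)  # sampling without replacement
    neg = rng.sample(negatives, n_per_class)
    samples = [(q, 1) for q in pos] + [(q, 0) for q in neg]
    rng.shuffle(samples)  # mix the two classes
    return samples


# Placeholder pools standing in for the real data (hypothetical contents).
positives = [f"tool query {i}" for i in range(20615)]
negatives = [f"plain query {i}" for i in range(2000)]

test_set = build_balanced_set(positives, negatives)
n_pos = sum(label for _, label in test_set)
print(len(test_set), n_pos, len(test_set) - n_pos)
```

Fixing the seed keeps the sampled subset reproducible, which makes it easier to compare results across runs.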

"Are the negative samples made available on the repository?": I'm sorry about that. I just realized that I forgot to upload them. I will upload them soon.

cp-jose commented 10 months ago

OK, thank you. How soon do you expect to have this fixed? I've pointed out a number of mistakes already, but nothing changed on the repository. It's still 3 months old.

HowieHwong commented 10 months ago

Hi,

I have uploaded the datasets to dataset/tmp_dataset. You can check them there.