QueryHistoryToday description mismatch

Lucaweihs commented 3 months ago

Issue

The description of the QueryHistoryToday API in code ("This API queries the history of the given date.") does not match the description present in the jsonl files QueryHistoryToday-level-3-2.jsonl and QueryHistoryToday-level-3-3.jsonl ("This API queries the history of a given user today."). This seems to mean these examples are guaranteed to fail, given the strict == check in the check_api_call_correctness method of tool_search.py.

Suggested fixes

Change
```
response['output'] == groundtruth['output']
```
to
```
response['api_name'] == groundtruth['api_name']
```
as we only really care that it found the correct API, and we already know what the description of that API is.
Alternatively, update the jsonl files to use the correct description.
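To illustrate the first option, here is a minimal sketch of the difference between the strict and the relaxed comparison. The dictionary shapes and the helper name `same_api` are assumptions made for illustration; the real structures in tool_search.py may differ.

```python
def same_api(response: dict, groundtruth: dict) -> bool:
    """Hypothetical relaxed check: compare only the name of the retrieved API."""
    return response["api_name"] == groundtruth["api_name"]


# Assumed shapes: 'output' holds the description text quoted in this issue.
response = {
    "api_name": "QueryHistoryToday",
    "output": "This API queries the history of the given date.",
}
groundtruth = {
    "api_name": "QueryHistoryToday",
    "output": "This API queries the history of a given user today.",
}

strict = response["output"] == groundtruth["output"]  # fails on the stale description
relaxed = same_api(response, groundtruth)             # passes, since the API is the same

print(f"strict={strict}, relaxed={relaxed}")
```

With the relaxed check, a stale description in the saved files no longer affects the result, at the cost of not noticing if the description really does regress.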

Lucaweihs commented 3 months ago

Also, I believe SymptomSearch-AppointmentRegistration-level-2-1.jsonl is invalid, as the API call is simply stated in text:

{"role": "AI", "text": "Great. Let me register your account now. [RegisterUser(username='user4', password='password4', email='user4@example.com')]"}

rather than being properly executed using the API role.
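Something like the following could be used to scan the jsonl files for other turns with the same problem. This is only a sketch: the function name `find_inline_api_calls` and the regular expression are my own, and it assumes every line is a JSON object with "role" and "text" keys like the one above.

```python
import json
import re

# Matches a bracketed call such as [RegisterUser(username='user4', ...)].
INLINE_CALL = re.compile(r"\[([A-Za-z_]\w*)\(.*?\)\]")


def find_inline_api_calls(jsonl_lines):
    """Return (line_number, api_name) for AI turns that embed a call in plain text."""
    hits = []
    for line_no, raw in enumerate(jsonl_lines, start=1):
        record = json.loads(raw)
        if record.get("role") != "AI":
            continue
        match = INLINE_CALL.search(record.get("text", ""))
        if match:
            hits.append((line_no, match.group(1)))
    return hits


sample = '''{"role": "AI", "text": "Great. Let me register your account now. [RegisterUser(username='user4', password='password4', email='user4@example.com')]"}'''
print(find_inline_api_calls([sample]))
```

Any hit is a candidate for manual review, since the pattern could also match legitimate bracketed text.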

Lucaweihs commented 3 months ago

More issues:

There are unexpected case changes between the saved jsonl files and the databases (e.g. "blinds" in Scenes.json does not match "blinds" in QueryScene-level-1-1.jsonl).
Random numbers are never seeded (e.g. appointment_id = str(random.randint(10000000, 99999999)) in appointment_registration.py), so they will change across runs, making results incomparable.
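For the seeding point, one possible approach is to draw the ID from a dedicated, seeded `random.Random` instance. This is a sketch only: the helper name `make_appointment_id` and the seed value 0 are assumptions, not what appointment_registration.py currently does.

```python
import random


def make_appointment_id(rng: random.Random) -> str:
    """Draw an 8-digit appointment ID from the supplied generator."""
    return str(rng.randint(10000000, 99999999))


# Two generators created with the same (arbitrary) seed yield the same ID,
# so saved ground truth stays comparable across runs.
first = make_appointment_id(random.Random(0))
second = make_appointment_id(random.Random(0))
print(first == second)
```

Passing the generator in explicitly also avoids depending on the global `random` state, which other code may advance.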

Lucaweihs commented 3 months ago

Another:

In EmergencyKnowledge-ModifyRegistration-RecordHealthData-level-2-3.jsonl, the health_data field is corrupt, as its value is a string rather than a JSON object.
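A check along these lines could catch this kind of corruption. It is a sketch under assumptions: the helper names and the sample value are made up for illustration, and the actual string in the file may not be valid JSON at all, in which case it is returned unchanged.

```python
import json


def is_structured(value) -> bool:
    """True if the value is already a parsed JSON container (object or array)."""
    return isinstance(value, (dict, list))


def parse_if_string(value):
    """Try to decode a string that holds serialized JSON; otherwise return it as-is."""
    if isinstance(value, str):
        try:
            return json.loads(value)
        except json.JSONDecodeError:
            return value
    return value


# Hypothetical corrupt value: JSON that was serialized a second time into a string.
corrupt = '[{"name": "blood_pressure", "value": "120/80"}]'
repaired = parse_if_string(corrupt)
print(is_structured(corrupt), is_structured(repaired))
```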

AlibabaResearch / DAMO-ConvAI

QueryHistoryToday description mismatch #142
