microsoft / autogen

A programming framework for agentic AI 🤖
https://microsoft.github.io/autogen/

[Issue]: Not able to call tool with llama 3.1 (from Groq) model in AutoGen #3217

Closed: jaygdesai closed this issue 1 month ago

jaygdesai commented 1 month ago

When I execute this (from the documentation, but with a Groq model): chat_result = user_proxy.initiate_chat(assistant, message="What is (44232 + 13312 / (232 - 32)) * 5?")

I get the following reply:

User (to Assistant):

What is (44232 + 13312 / (232 - 32)) * 5?


>>>>>>>> USING AUTO REPLY...

BadRequestError                           Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/autogen/oai/groq.py in create(self, params)
    154         try:
--> 155             response = client.chat.completions.create(**groq_params)
    156         except Exception as e:

12 frames
BadRequestError: Error code: 400 - {'error': {'message': "Failed to call a function. Please adjust your prompt. See 'failed_generation' for more details.", 'type': 'invalid_request_error', 'code': 'tool_use_failed', 'failed_generation': '\n{\n "tool_calls": [\n {\n "id": "pending",\n "type": "function",\n "function": {\n "name": "subtract",\n "parameters": {\n "a": "232",\n "b": "32"\n }\n }\n }\n ]\n}\n'}}

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/autogen/oai/groq.py in create(self, params)
    155             response = client.chat.completions.create(**groq_params)
    156         except Exception as e:
--> 157             raise RuntimeError(f"Groq exception occurred: {e}")
    158         else:
    159

RuntimeError: Groq exception occurred: Error code: 400 - {'error': {'message': "Failed to call a function. Please adjust your prompt. See 'failed_generation' for more details.", 'type': 'invalid_request_error', 'code': 'tool_use_failed', 'failed_generation': '\n{\n "tool_calls": [\n {\n "id": "pending",\n "type": "function",\n "function": {\n "name": "subtract",\n "parameters": {\n "a": "232",\n "b": "32"\n }\n }\n }\n ]\n}\n'}}

It identifies the tool call correctly, but the call fails to execute. My code works when calling with Cohere (and presumably OpenAI, though I haven't tried it), but Groq gives the above error.

Is there anything I am missing?

Steps to reproduce

# In Google Colab:

!apt install python3.10-venv
!python3 -m venv pyautogen
!source pyautogen/bin/activate  # note: each ! runs in a fresh shell, so this activation does not persist
!pip install pyautogen
!pip install "pyautogen[groq]"  # quotes keep the shell from globbing the brackets
!pip install python-dotenv

from dotenv import load_dotenv, find_dotenv

import os
from autogen import ConversableAgent
import autogen
import groq

_ = load_dotenv(find_dotenv())

GROQ_API_KEY = os.environ["GROQ_API_KEY"]
print("GROQ_API_KEY is set:", bool(GROQ_API_KEY))  # avoid printing the secret itself

config_list = [
    {
      "model": "llama-3.1-70b-versatile",
      "api_key": GROQ_API_KEY,
      "api_type": "groq"
    }
]

def add(a: int, b: int) -> int:
    return a + b

def subtract(a: int, b: int) -> int:
    return a - b

def multiply(a: int, b: int) -> int:
    return a * b

def divide(a: int, b: int) -> float:
    # True division returns a float, so annotate accordingly
    return a / b

# Let's first define the assistant agent that suggests tool calls.
assistant = ConversableAgent(
    name="Assistant",
    system_message="You are a helpful AI assistant. "
    "You can help with simple calculations. "
    "Return 'TERMINATE' when the task is done.",
    llm_config={"config_list": config_list},
)

# The user proxy agent is used for interacting with the assistant agent
# and executes tool calls.
user_proxy = ConversableAgent(
    name="User",
    llm_config=False,
    is_termination_msg=lambda msg: msg.get("content") is not None and "TERMINATE" in msg["content"],
    human_input_mode="NEVER",
)

# Register the tool signature with the assistant agent.
assistant.register_for_llm(name="add", description="Add two numbers")(add)
assistant.register_for_llm(name="subtract", description="subtract b from a")(subtract)
assistant.register_for_llm(name="multiply", description="Multiply two numbers")(multiply)
assistant.register_for_llm(name="divide", description="Divide a by b")(divide)

# Register the tool function with the user proxy agent.
user_proxy.register_for_execution(name="add")(add)
user_proxy.register_for_execution(name="subtract")(subtract)
user_proxy.register_for_execution(name="multiply")(multiply)
user_proxy.register_for_execution(name="divide")(divide)

chat_result = user_proxy.initiate_chat(assistant, message="What is (44232 + 13312 / (232 - 32)) * 5?")

Screenshots and logs

No response

Additional Information

Package versions (pip list output):


absl-py 1.4.0 accelerate 0.32.1 aiohttp 3.9.5 aiosignal 1.3.1 alabaster 0.7.16 albumentations 1.3.1 altair 4.2.2 annotated-types 0.7.0 anyio 3.7.1 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 array_record 0.5.1 arviz 0.15.1 astropy 5.3.4 astunparse 1.6.3 async-timeout 4.0.3 atpublic 4.1.0 attrs 23.2.0 audioread 3.0.1 autograd 1.6.2 Babel 2.15.0 backcall 0.2.0 beautifulsoup4 4.12.3 bidict 0.23.1 bigframes 1.11.1 bleach 6.1.0 blinker 1.4 blis 0.7.11 blosc2 2.0.0 bokeh 3.3.4 bqplot 0.12.43 branca 0.7.2 build 1.2.1 CacheControl 0.14.0 cachetools 5.4.0 catalogue 2.0.10 certifi 2024.7.4 cffi 1.16.0 chardet 5.2.0 charset-normalizer 3.3.2 chex 0.1.86 clarabel 0.9.0 click 8.1.7 click-plugins 1.1.1 cligj 0.7.2 cloudpathlib 0.18.1 cloudpickle 2.2.1 cmake 3.27.9 cmdstanpy 1.2.4 colorcet 3.1.0 colorlover 0.3.0 colour 0.1.5 community 1.0.0b1 confection 0.1.5 cons 0.4.6 contextlib2 21.6.0 contourpy 1.2.1 cryptography 43.0.0 cuda-python 12.2.1 cudf-cu12 24.4.1 cufflinks 0.17.3 cupy-cuda12x 12.2.0 cvxopt 1.3.2 cvxpy 1.5.2 cycler 0.12.1 cymem 2.0.8 Cython 3.0.10 dask 2023.8.1 datascience 0.17.6 db-dtypes 1.2.0 dbus-python 1.2.18 debugpy 1.6.6 decorator 4.4.2 defusedxml 0.7.1 diskcache 5.6.3 distributed 2023.8.1 distro 1.7.0 dlib 19.24.4 dm-tree 0.1.8 docker 7.1.0 docstring_parser 0.16 docutils 0.18.1 dopamine_rl 4.0.9 duckdb 0.10.3 earthengine-api 0.1.412 easydict 1.13 ecos 2.0.14 editdistance 0.6.2 eerepr 0.0.4 en-core-web-sm 3.7.1 entrypoints 0.4 et-xmlfile 1.1.0 etils 1.7.0 etuples 0.3.9 exceptiongroup 1.2.2 fastai 2.7.15 fastcore 1.5.54 fastdownload 0.0.7 fastjsonschema 2.20.0 fastprogress 1.0.3 fastrlock 0.8.2 filelock 3.15.4 fiona 1.9.6 firebase-admin 5.3.0 FLAML 2.1.2 Flask 2.2.5 flatbuffers 24.3.25 flax 0.8.4 folium 0.14.0 fonttools 4.53.1 frozendict 2.4.4 frozenlist 1.4.1 fsspec 2023.6.0 future 0.18.3 gast 0.6.0 gcsfs 2023.6.0 GDAL 3.6.4 gdown 5.1.0 geemap 0.33.1 gensim 4.3.3 geocoder 1.38.1 geographiclib 2.0 geopandas 0.13.2 geopy 2.3.0 gin-config 0.5.0 glob2 0.7 google 2.0.3 google-ai-generativelanguage 0.6.6 google-api-core 2.19.1 google-api-python-client 2.137.0 google-auth 2.27.0 google-auth-httplib2 0.2.0 google-auth-oauthlib 1.2.1 google-cloud-aiplatform 1.59.0 google-cloud-bigquery 3.25.0 google-cloud-bigquery-connection 1.15.4 google-cloud-bigquery-storage 2.25.0 google-cloud-bigtable 2.25.0 google-cloud-core 2.4.1 google-cloud-datastore 2.19.0 google-cloud-firestore 2.16.1 google-cloud-functions 1.16.4 google-cloud-iam 2.15.1 google-cloud-language 2.13.4 google-cloud-pubsub 2.22.0 google-cloud-resource-manager 1.12.4 google-cloud-storage 2.8.0 google-cloud-translate 3.15.4 google-colab 1.0.0 google-crc32c 1.5.0 google-generativeai 0.7.2 google-pasta 0.2.0 google-resumable-media 2.7.1 googleapis-common-protos 1.63.2 googledrivedownloader 0.4 graphviz 0.20.3 greenlet 3.0.3 groq 0.9.0 grpc-google-iam-v1 0.13.1 grpcio 1.64.1 grpcio-status 1.48.2 gspread 6.0.2 gspread-dataframe 3.3.1 gym 0.25.2 gym-notices 0.0.8 h11 0.14.0 h5netcdf 1.3.0 h5py 3.9.0 holidays 0.53 holoviews 1.17.1 html5lib 1.1 httpcore 1.0.5 httpimport 1.3.1 httplib2 0.22.0 httpx 0.27.0 huggingface-hub 0.23.5 humanize 4.7.0 hyperopt 0.2.7 ibis-framework 8.0.0 idna 3.7 imageio 2.31.6 imageio-ffmpeg 0.5.1 imagesize 1.4.1 imbalanced-learn 0.10.1 imgaug 0.4.0 immutabledict 4.2.0 importlib_metadata 8.0.0 importlib_resources 6.4.0 imutils 0.5.4 inflect 7.0.0 iniconfig 2.0.0 intel-openmp 2023.2.4 ipyevents 2.0.2 ipyfilechooser 0.6.0 ipykernel 5.5.6 ipyleaflet 0.18.2 ipyparallel 8.8.0 ipython 7.34.0 ipython-genutils 0.2.0 ipython-sql 
0.5.0 ipytree 0.2.2 ipywidgets 7.7.1 itsdangerous 2.2.0 jax 0.4.26 jaxlib 0.4.26+cuda12.cudnn89 jeepney 0.7.1 jellyfish 1.0.4 jieba 0.42.1 Jinja2 3.1.4 joblib 1.4.2 jsonpickle 3.2.2 jsonschema 4.19.2 jsonschema-specifications 2023.12.1 jupyter-client 6.1.12 jupyter-console 6.1.0 jupyter_core 5.7.2 jupyter-server 1.24.0 jupyterlab_pygments 0.3.0 jupyterlab_widgets 3.0.11 kaggle 1.6.14 kagglehub 0.2.8 keras 2.15.0 keyring 23.5.0 kiwisolver 1.4.5 langcodes 3.4.0 language_data 1.2.0 launchpadlib 1.10.16 lazr.restfulclient 0.14.4 lazr.uri 1.0.6 lazy_loader 0.4 libclang 18.1.1 librosa 0.10.2.post1 lightgbm 4.1.0 linkify-it-py 2.0.3 llvmlite 0.41.1 locket 1.0.0 logical-unification 0.4.6 lxml 4.9.4 malloy 2023.1067 marisa-trie 1.2.0 Markdown 3.6 markdown-it-py 3.0.0 MarkupSafe 2.1.5 matplotlib 3.7.1 matplotlib-inline 0.1.7 matplotlib-venn 0.11.10 mdit-py-plugins 0.4.1 mdurl 0.1.2 miniKanren 1.0.3 missingno 0.5.2 mistune 0.8.4 mizani 0.9.3 mkl 2023.2.0 ml-dtypes 0.2.0 mlxtend 0.22.0 more-itertools 10.1.0 moviepy 1.0.3 mpmath 1.3.0 msgpack 1.0.8 multidict 6.0.5 multipledispatch 1.0.0 multitasking 0.0.11 murmurhash 1.0.10 music21 9.1.0 natsort 8.4.0 nbclassic 1.1.0 nbclient 0.10.0 nbconvert 6.5.4 nbformat 5.10.4 nest-asyncio 1.6.0 networkx 3.3 nibabel 4.0.2 nltk 3.8.1 notebook 6.5.5 notebook_shim 0.2.4 numba 0.58.1 numexpr 2.10.1 numpy 1.25.2 nvtx 0.2.10 oauth2client 4.1.3 oauthlib 3.2.2 openai 1.37.1 opencv-contrib-python 4.8.0.76 opencv-python 4.8.0.76 opencv-python-headless 4.10.0.84 openpyxl 3.1.5 opt-einsum 3.3.0 optax 0.2.2 orbax-checkpoint 0.4.4 osqp 0.6.2.post8 packaging 24.1 pandas 2.0.3 pandas-datareader 0.10.0 pandas-gbq 0.19.2 pandas-stubs 2.0.3.230814 pandocfilters 1.5.1 panel 1.3.8 param 2.1.1 parso 0.8.4 parsy 2.1 partd 1.4.2 pathlib 1.0.1 patsy 0.5.6 peewee 3.17.6 pexpect 4.9.0 pickleshare 0.7.5 Pillow 9.4.0 pip 24.1.2 pip-tools 7.4.1 platformdirs 4.2.2 plotly 5.15.0 plotnine 0.12.4 pluggy 1.5.0 polars 0.20.2 pooch 1.8.2 portpicker 1.5.2 prefetch_generator 1.0.3 preshed 3.0.9 prettytable 3.10.2 proglog 0.1.10 progressbar2 4.2.0 prometheus_client 0.20.0 promise 2.3 prompt_toolkit 3.0.47 prophet 1.1.5 proto-plus 1.24.0 protobuf 3.20.3 psutil 5.9.5 psycopg2 2.9.9 ptyprocess 0.7.0 py-cpuinfo 9.0.0 py4j 0.10.9.7 pyarrow 14.0.2 pyarrow-hotfix 0.6 pyasn1 0.6.0 pyasn1_modules 0.4.0 pyautogen 0.2.32 pycocotools 2.0.8 pycparser 2.22 pydantic 2.8.2 pydantic_core 2.20.1 pydata-google-auth 1.8.2 pydot 1.4.2 pydot-ng 2.0.0 pydotplus 2.0.2 PyDrive 1.3.1 PyDrive2 1.6.3 pyerfa 2.0.1.4 pygame 2.6.0 Pygments 2.16.1 PyGObject 3.42.1 PyJWT 2.3.0 pymc 5.10.4 pymystem3 0.2.0 pynvjitlink-cu12 0.3.0 PyOpenGL 3.1.7 pyOpenSSL 24.2.1 pyparsing 3.1.2 pyperclip 1.9.0 pyproj 3.6.1 pyproject_hooks 1.1.0 pyshp 2.3.1 PySocks 1.7.1 pytensor 2.18.6 pytest 7.4.4 python-apt 2.4.0 python-box 7.2.0 python-dateutil 2.8.2 python-dotenv 1.0.1 python-louvain 0.16 python-slugify 8.0.4 python-utils 3.8.2 pytz 2023.4 pyviz_comms 3.0.2 PyWavelets 1.6.0 PyYAML 6.0.1 pyzmq 24.0.1 qdldl 0.1.7.post4 qudida 0.0.4 ratelim 0.1.6 referencing 0.35.1 regex 2024.5.15 requests 2.31.0 requests-oauthlib 1.3.1 requirements-parser 0.9.0 rich 13.7.1 rmm-cu12 24.4.0 rpds-py 0.19.0 rpy2 3.4.2 rsa 4.9 safetensors 0.4.3 scikit-image 0.19.3 scikit-learn 1.2.2 scipy 1.11.4 scooby 0.10.0 scs 3.2.6 seaborn 0.13.1 SecretStorage 3.3.1 Send2Trash 1.8.3 sentencepiece 0.1.99 setuptools 71.0.4 shapely 2.0.5 shellingham 1.5.4 simple_parsing 0.1.5 six 1.16.0 sklearn-pandas 2.2.0 smart-open 7.0.4 sniffio 1.3.1 snowballstemmer 2.2.0 sortedcontainers 2.4.0 soundfile 
0.12.1 soupsieve 2.5 soxr 0.3.7 spacy 3.7.5 spacy-legacy 3.0.12 spacy-loggers 1.0.5 Sphinx 5.0.2 sphinxcontrib-applehelp 1.0.8 sphinxcontrib-devhelp 1.0.6 sphinxcontrib-htmlhelp 2.0.6 sphinxcontrib-jsmath 1.0.1 sphinxcontrib-qthelp 1.0.8 sphinxcontrib-serializinghtml 1.1.10 SQLAlchemy 2.0.31 sqlglot 20.11.0 sqlparse 0.5.1 srsly 2.4.8 stanio 0.5.1 statsmodels 0.14.2 StrEnum 0.4.15 sympy 1.13.1 tables 3.8.0 tabulate 0.9.0 tbb 2021.13.0 tblib 3.0.0 tenacity 8.5.0 tensorboard 2.15.2 tensorboard-data-server 0.7.2 tensorflow 2.15.0 tensorflow-datasets 4.9.6 tensorflow-estimator 2.15.0 tensorflow-gcs-config 2.15.0 tensorflow-hub 0.16.1 tensorflow-io-gcs-filesystem 0.37.1 tensorflow-metadata 1.15.0 tensorflow-probability 0.23.0 tensorstore 0.1.45 termcolor 2.4.0 terminado 0.18.1 text-unidecode 1.3 textblob 0.17.1 tf_keras 2.15.1 tf-slim 1.1.0 thinc 8.2.5 threadpoolctl 3.5.0 tifffile 2024.7.21 tiktoken 0.7.0 tinycss2 1.3.0 tokenizers 0.19.1 toml 0.10.2 tomli 2.0.1 toolz 0.12.1 torch 2.3.1+cu121 torchaudio 2.3.1+cu121 torchsummary 1.5.1 torchtext 0.18.0 torchvision 0.18.1+cu121 tornado 6.3.3 tqdm 4.66.4 traitlets 5.7.1 traittypes 0.2.1 transformers 4.42.4 triton 2.3.1 tweepy 4.14.0 typer 0.12.3 types-pytz 2024.1.0.20240417 types-setuptools 71.1.0.20240724 typing_extensions 4.12.2 tzdata 2024.1 tzlocal 5.2 uc-micro-py 1.0.3 uritemplate 4.1.1 urllib3 2.0.7 vega-datasets 0.9.0 wadllib 1.3.6 wasabi 1.1.3 wcwidth 0.2.13 weasel 0.4.1 webcolors 24.6.0 webencodings 0.5.1 websocket-client 1.8.0 Werkzeug 3.0.3 wheel 0.43.0 widgetsnbextension 3.6.7 wordcloud 1.9.3 wrapt 1.14.1 xarray 2023.7.0 xarray-einstats 0.7.0 xgboost 2.0.3 xlrd 2.0.1 xyzservices 2024.6.0 yarl 1.9.4 yellowbrick 1.5 yfinance 0.2.41 zict 3.0.0 zipp 3.19.2

marklysze commented 1 month ago

Hey @jaygdesai, this looks like a Llama 3.1 issue, I've noticed that it can put double quotes around the numeric parameters: "parameters": {\n "a": "232",\n "b": "32"\n } That should be: "parameters": {\n "a": 232,\n "b": 32\n }.

I've found that I need to add an example to the assistant's system_message, going from this:

assistant = ConversableAgent(
    name="Assistant",
    system_message="You are a helpful AI assistant. "
    "You can help with simple calculations. "
    "Return 'TERMINATE' when the task is done.",
    llm_config={"config_list": config_list},
)

to this:

assistant = ConversableAgent(
    name="Assistant",
    system_message="""You are a helpful AI assistant.
    You can help with simple calculations. 
    Example of the return JSON is:
        {
            "parameter_1_name": 100.00,
            "parameter_2_name": "ABC",
            "parameter_3_name": "DEF",
        }.
        Another example of the return JSON is:
        {
            "parameter_1_name": "GHI",
            "parameter_2_name": "ABC",
            "parameter_3_name": "DEF",
            "parameter_4_name": 123.00,
        }.
    Return 'TERMINATE' when the task is done.""",
    llm_config={"config_list": config_list},
)

Can you try that?

If that still results in strings for the numeric values try converting strings to integers in your calculation functions.
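
For example, a coercion-tolerant version of add might look like this (a minimal sketch; the same approach applies to the other three functions):

def add(a, b) -> int:
    # Accept int or str; Llama 3.1 sometimes sends numbers as strings, e.g. "232"
    return int(a) + int(b)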

gauravdhiman commented 1 month ago

@jaygdesai I think it would be interesting to look into what tools config is getting passed to the LLM; it may not be making the parameter types clear to the LLM. If that is the case, use typing to ensure the parameter types are well defined in the tools config that gets passed to the LLM. The LLM should honor the function signature.
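
For example (a sketch, assuming pyautogen 0.2.x, where register_for_llm stores the generated schema in the agent's llm_config):

import json

# Print the JSON schema of the registered tools exactly as it will be sent to the LLM
print(json.dumps(assistant.llm_config.get("tools", []), indent=2))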

jaygdesai commented 1 month ago

> Hey @jaygdesai, this looks like a Llama 3.1 issue, I've noticed that it can put double quotes around the numeric parameters [...] Can you try that? If that still results in strings for the numeric values try converting strings to integers in your calculation functions.

When I tried the suggested approach, I still got the following reply.

User (to Assistant):

What is (44232 + 13312 / (232 - 32)) * 5?

--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...
---------------------------------------------------------------------------
BadRequestError                           Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/autogen/oai/groq.py in create(self, params)
    154         try:
--> 155             response = client.chat.completions.create(**groq_params)
    156         except Exception as e:

12 frames
BadRequestError: Error code: 400 - {'error': {'message': "Failed to call a function. Please adjust your prompt. See 'failed_generation' for more details.", 'type': 'invalid_request_error', 'code': 'tool_use_failed', 'failed_generation': '<tool-use>\n{\n\t"tool_calls": [\n\t\t{\n\t\t\t"id": "pending",\n\t\t\t"type": "function",\n\t\t\t"function": {\n\t\t\t\t"name": "add",\n\t\t\t\t"parameters": {\n\t\t\t\t\t"a": "44232",\n\t\t\t\t\t"b": "13312 / (232 - 32)"\n\t\t\t\t}\n\t\t\t},\n\t\t\t"parameters": {}\n\t\t},\n\t\t{\n\t\t\t"id": "pending",\n\t\t\t"type": "function",\n\t\t\t"function": {\n\t\t\t\t"name": "subtract",\n\t\t\t\t"parameters": {\n\t\t\t\t\t"a": "232",\n\t\t\t\t\t"b": "32"\n\t\t\t\t}\n\t\t\t},\n\t\t\t"parameters": {}\n\t\t},\n\t\t{\n\t\t\t"id": "pending",\n\t\t\t"type": "function",\n\t\t\t"function": {\n\t\t\t\t"name": "divide",\n\t\t\t\t"parameters": {\n\t\t\t\t\t"a": "13312",\n\t\t\t\t\t"b": "200"\n\t\t\t\t}\n\t\t\t},\n\t\t\t"parameters": {}\n\t\t},\n\t\t{\n\t\t\t"id": "pending",\n\t\t\t"type": "function",\n\t\t\t"function": {\n\t\t\t\t"name": "multiply",\n\t\t\t\t"parameters": {\n\t\t\t\t\t"a": "44232 + 66.2",\n\t\t\t\t\t"b": "5"\n\t\t\t\t}\n\t\t\t},\n\t\t\t"parameters": {}\n\t\t}\n\t]\n}\n</tool-use>'}}

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/autogen/oai/groq.py in create(self, params)
    155             response = client.chat.completions.create(**groq_params)
    156         except Exception as e:
--> 157             raise RuntimeError(f"Groq exception occurred: {e}")
    158         else:
    159 

RuntimeError: Groq exception occurred: Error code: 400 - {'error': {'message': "Failed to call a function. Please adjust your prompt. See 'failed_generation' for more details.", 'type': 'invalid_request_error', 'code': 'tool_use_failed', 'failed_generation': '<tool-use>\n{\n\t"tool_calls": [\n\t\t{\n\t\t\t"id": "pending",\n\t\t\t"type": "function",\n\t\t\t"function": {\n\t\t\t\t"name": "add",\n\t\t\t\t"parameters": {\n\t\t\t\t\t"a": "44232",\n\t\t\t\t\t"b": "13312 / (232 - 32)"\n\t\t\t\t}\n\t\t\t},\n\t\t\t"parameters": {}\n\t\t},\n\t\t{\n\t\t\t"id": "pending",\n\t\t\t"type": "function",\n\t\t\t"function": {\n\t\t\t\t"name": "subtract",\n\t\t\t\t"parameters": {\n\t\t\t\t\t"a": "232",\n\t\t\t\t\t"b": "32"\n\t\t\t\t}\n\t\t\t},\n\t\t\t"parameters": {}\n\t\t},\n\t\t{\n\t\t\t"id": "pending",\n\t\t\t"type": "function",\n\t\t\t"function": {\n\t\t\t\t"name": "divide",\n\t\t\t\t"parameters": {\n\t\t\t\t\t"a": "13312",\n\t\t\t\t\t"b": "200"\n\t\t\t\t}\n\t\t\t},\n\t\t\t"parameters": {}\n\t\t},\n\t\t{\n\t\t\t"id": "pending",\n\t\t\t"type": "function",\n\t\t\t"function": {\n\t\t\t\t"name": "multiply",\n\t\t\t\t"parameters": {\n\t\t\t\t\t"a": "44232 + 66.2",\n\t\t\t\t\t"b": "5"\n\t\t\t\t}\n\t\t\t},\n\t\t\t"parameters": {}\n\t\t}\n\t]\n}\n</tool-use>'}}

And when I reverted to my previous assistant initialization and changed the function parameters to accept str, I got the following error:

User (to Assistant):

What is (44232 + 13312 / (232 - 32)) * 5?

--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...
---------------------------------------------------------------------------
BadRequestError                           Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/autogen/oai/groq.py in create(self, params)
    154         try:
--> 155             response = client.chat.completions.create(**groq_params)
    156         except Exception as e:

12 frames
BadRequestError: Error code: 400 - {'error': {'message': "Failed to call a function. Please adjust your prompt. See 'failed_generation' for more details.", 'type': 'invalid_request_error', 'code': 'tool_use_failed', 'failed_generation': '<tool-use>\n{\n  "tool_calls": [\n    {\n      "id": "pending",\n      "type": "function",\n      "function": {\n        "name": "add",\n        "parameters": {\n          "a": "44232",\n          "b": "5390"\n        }\n      }\n    }\n  ]\n}\n</tool-use>'}}

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/autogen/oai/groq.py in create(self, params)
    155             response = client.chat.completions.create(**groq_params)
    156         except Exception as e:
--> 157             raise RuntimeError(f"Groq exception occurred: {e}")
    158         else:
    159 

RuntimeError: Groq exception occurred: Error code: 400 - {'error': {'message': "Failed to call a function. Please adjust your prompt. See 'failed_generation' for more details.", 'type': 'invalid_request_error', 'code': 'tool_use_failed', 'failed_generation': '<tool-use>\n{\n  "tool_calls": [\n    {\n      "id": "pending",\n      "type": "function",\n      "function": {\n        "name": "add",\n        "parameters": {\n          "a": "44232",\n          "b": "5390"\n        }\n      }\n    }\n  ]\n}\n</tool-use>'}}

In the above case, the function definitions were changed to:

def add(a: str, b: str) -> int:
    ai = int(a)
    bi = int(b)
    return ai + bi

def subtract(a: str, b: str) -> int:
    ai = int(a)
    bi = int(b)
    return ai - bi

def multiply(a: str, b: str) -> int:
    ai = int(a)
    bi = int(b)
    return ai * bi

def divide(a: str, b: str) -> float:
    ai = int(a)
    bi = int(b)
    return ai / bi

jaygdesai commented 1 month ago

> @jaygdesai I think it will be interesting to look into what tools config is getting passed to LLM [...] LLM should honor the function signature.

Can you let me know how to view the tool config passed to the LLM from AutoGen? Also, I have already included the parameter types in the function definitions.

marklysze commented 1 month ago

Hey @jaygdesai, if you're able to put a breakpoint at the client.chat.completions.create(**groq_params) line in groq.py (line 155 in your traceback), have a look at groq_params and you'll see everything that's going to Groq.

If you can't put a breakpoint in there, try adding a line before it with: print(groq_params)

Feel free to post it here as well.
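
For reference, the temporary debug print would sit just above the call in groq.py (line numbers from your traceback):

    try:
        print(groq_params)  # temporary: shows the full payload (model, messages, tools) sent to Groq
        response = client.chat.completions.create(**groq_params)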

gauravdhiman commented 1 month ago

@jaygdesai in addition to what @marklysze suggested, you can also print agent.function_map to see what function details are getting passed.
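
For example:

# Sanity check: the executor agent should have all four functions registered
print(user_proxy.function_map)  # e.g. {'add': <function add at 0x...>, ...}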

jaygdesai commented 1 month ago

I recently ran the code again. No error! Most likely there was a correction in the Llama / Groq layers. I think it is a challenge to predict the behaviour of API calls to any hosted LLM, where the LLM implementation/version may be changing: something tested today may not be valid tomorrow. It would be great if companies provided check-pointed LLMs (for inference) so that application developers could try different prompts to get their implementation right and could produce similar results in production.

jaygdesai commented 1 month ago

Closing the issue, as the bug is no longer reproducible.