langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
93k stars 14.94k forks

Running Groq with the llama3 model keeps returning unformatted output #23248

Open HEYBOY789 opened 3 months ago

HEYBOY789 commented 3 months ago

Checked other resources

Example Code

from langchain_groq import ChatGroq

# Model name taken from the response metadata below (llama3-70b-8192).
llama3_groq_model = ChatGroq(model="llama3-70b-8192", temperature=0, groq_api_key="gsk_")

def run_tools(query):
    tools = [serp_search]
    tools_by_name = {tool.name: tool for tool in tools}

    # Let the model decide whether to call a tool.
    llm_with_tool = llama3_groq_model.bind_tools(tools)
    res = llm_with_tool.invoke(query)
    tool_calls = res.tool_calls

    if tool_calls:
        # Run the most recent tool call.
        name = tool_calls[-1]['name']
        args = tool_calls[-1]['args']
        print(f'Running Tool {name}...')
        rs = tools_by_name[name].invoke(args)
    else:
        # No tool call: fall back to the model's plain text response.
        rs = res.content
        name = ''
        args = {}

    return {'result': rs, 'last_tool_calls': tool_calls}

Error Message and Stack Trace (if applicable)

Expected response: content='' additional_kwargs={'tool_calls': [{'id': 'call_7d3a', 'function': {'arguments': '{"keyword":"đài quảng bình"}', 'name': 'serp_search'}, 'type': 'function'}]} response_metadata={'token_usage': {'completion_time': 0.128152054, 'completion_tokens': 47, 'prompt_time': 0.197270744, 'prompt_tokens': 932, 'queue_time': None, 'total_time': 0.32542279799999996, 'total_tokens': 979}, 'model_name': 'llama3-70b-8192', 'system_fingerprint': 'fp_c1a4bcec29', 'finish_reason': 'tool_calls', 'logprobs': None} id='run-faa4fae3-93ab-4a13-8e5b-9e2269c1594f-0' tool_calls=[{'name': 'serp_search', 'args': {'keyword': 'đài quảng bình'}, 'id': 'call_7d3a'}]

Not expected response: content='assistant<|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|><|start_header_id|>' response_metadata={'token_usage': {'completion_time': 11.079835881, 'completion_tokens': 4000, 'prompt_time': 0.165631979, 'prompt_tokens': 935, 'queue_time': None, 'total_time': 11.24546786, 'total_tokens': 4935}, 'model_name': 'llama3-70b-8192', 'system_fingerprint': 'fp_2f30b0b571', 'finish_reason': 'length', 'logprobs': None} id='run-a89e17f6-bea2-4db6-8969-94820098f2dc-0'

Description

When the model calls a tool, it sometimes returns the response as expected, but sometimes it responds like the error above.

The `<|start_header_id|>` token is duplicated many times, and the response takes too long to finish. I have to wait for the response, check it, and rerun the process to get a correct one.
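The "check and rerun" workaround described above can be sketched as a small helper. This is a hypothetical sketch, not LangChain API: `is_garbled` and `invoke_with_retry` are made-up names, and the token list assumes the garbling always leaks Llama 3 chat-template tokens like the ones in the bad response.

```python
# Llama 3 chat-template special tokens; these should never appear in a
# clean chat response, so their presence signals a garbled output.
LLAMA3_SPECIAL_TOKENS = ("<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>")

def is_garbled(content: str) -> bool:
    """Return True if the model output leaked raw chat-template tokens."""
    return any(token in content for token in LLAMA3_SPECIAL_TOKENS)

def invoke_with_retry(llm, query, max_retries=3):
    """Re-invoke the model until the output is clean or retries run out."""
    for _ in range(max_retries):
        res = llm.invoke(query)
        # Tool-call responses carry empty content, so only check text replies.
        if res.tool_calls or not is_garbled(res.content):
            return res
    raise RuntimeError("model kept returning garbled output")
```

This only reduces wasted wall-clock time; it does not fix the underlying issue, since each bad attempt still burns the full 4000-token generation.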

Not only that, the success rate of a normal chain involving a prompt and parser is low too. The good thing is LangGraph does the rerun, so I don't have to worry much about it, but it still takes a lot of time.

I have only encountered this problem since yesterday. Before that, it worked flawlessly.

I updated langchain, langgraph, langsmith, and langchain_groq to the latest versions yesterday; I think that caused the problem.

System Info

aiohttp==3.9.5 aiosignal==1.3.1 alembic==1.13.1 annotated-types==0.6.0 anthropic==0.28.1 anyio==4.3.0 appdirs==1.4.4 asgiref==3.8.1 asttokens==2.4.1 async-timeout==4.0.3 attrs==23.2.0 Babel==2.15.0 backoff==2.2.1 bcrypt==4.1.3 beautifulsoup4==4.12.3 blinker==1.8.2 boto3==1.34.127 botocore==1.34.127 Brotli==1.1.0 bs4==0.0.2 build==1.2.1 cachetools==5.3.3 catalogue==2.0.10 certifi==2024.2.2 cffi==1.16.0 charset-normalizer==3.3.2 chroma-hnswlib==0.7.3 chromadb==0.4.24 click==8.1.7 coloredlogs==15.0.1 comm==0.2.2 courlan==1.1.0 crewai==0.28.8 crewai-tools==0.2.3 cryptography==42.0.6 dataclasses-json==0.6.5 dateparser==1.2.0 debugpy==1.8.1 decorator==5.1.1 defusedxml==0.7.1 Deprecated==1.2.14 deprecation==2.1.0 dirtyjson==1.0.8 distro==1.9.0 docstring-parser==0.15 embedchain==0.1.102 exceptiongroup==1.2.1 executing==2.0.1 faiss-cpu==1.8.0 faiss-gpu==1.7.2 fast-pytorch-kmeans==0.2.0.1 fastapi==0.110.3 filelock==3.14.0 flatbuffers==24.3.25 free-proxy==1.1.1 frozenlist==1.4.1 fsspec==2024.3.1 git-python==1.0.3 gitdb==4.0.11 GitPython==3.1.43 google==3.0.0 google-ai-generativelanguage==0.6.4 google-api-core==2.19.0 google-api-python-client==2.133.0 google-auth==2.29.0 google-auth-httplib2==0.2.0 google-cloud-aiplatform==1.50.0 google-cloud-bigquery==3.21.0 google-cloud-core==2.4.1 google-cloud-resource-manager==1.12.3 google-cloud-storage==2.16.0 google-crc32c==1.5.0 google-generativeai==0.5.4 google-resumable-media==2.7.0 googleapis-common-protos==1.63.0 gptcache==0.1.43 graphviz==0.20.3 greenlet==3.0.3 groq==0.5.0 grpc-google-iam-v1==0.13.0 grpcio==1.63.0 grpcio-status==1.62.2 h11==0.14.0 html2text==2024.2.26 htmldate==1.8.1 httpcore==1.0.5 httplib2==0.22.0 httptools==0.6.1 httpx==0.27.0 huggingface-hub==0.23.0 humanfriendly==10.0 idna==3.7 importlib-metadata==7.0.0 importlib_resources==6.4.0 iniconfig==2.0.0 instructor==0.5.2 ipykernel==6.29.4 ipython==8.24.0 itsdangerous==2.2.0 jedi==0.19.1 Jinja2==3.1.3 jiter==0.4.2 jmespath==1.0.1 joblib==1.4.2 jsonpatch==1.33 
jsonpointer==2.4 jupyter_client==8.6.1 jupyter_core==5.7.2 jusText==3.0.0 kubernetes==29.0.0 lancedb==0.5.7 langchain==0.2.5 langchain-anthropic==0.1.11 langchain-aws==0.1.3 langchain-chroma==0.1.0 langchain-community==0.2.5 langchain-core==0.2.8 langchain-experimental==0.0.60 langchain-google-genai==1.0.3 langchain-groq==0.1.5 langchain-openai==0.1.6 langchain-text-splitters==0.2.1 langchainhub==0.1.18 langgraph==0.0.57 langsmith==0.1.80 lark==1.1.9 llama-index==0.10.36 llama-index-agent-openai==0.2.4 llama-index-cli==0.1.12 llama-index-embeddings-openai==0.1.9 llama-index-indices-managed-llama-cloud==0.1.6 llama-index-llms-openai==0.1.18 llama-index-multi-modal-llms-openai==0.1.5 llama-index-program-openai==0.1.6 llama-index-question-gen-openai==0.1.3 llama-index-readers-file==0.1.22 llama-index-readers-llama-parse==0.1.4 llama-parse==0.4.2 lxml==5.1.1 Mako==1.3.3 markdown-it-py==3.0.0 MarkupSafe==2.1.5 marshmallow==3.21.2 matplotlib-inline==0.1.7 mdurl==0.1.2 minify_html==0.15.0 mmh3==4.1.0 monotonic==1.6 mpmath==1.3.0 multidict==6.0.5 mutagen==1.47.0 mypy-extensions==1.0.0 nest-asyncio==1.6.0 networkx==3.3 nodeenv==1.8.0 numpy==1.26.4 nvidia-cublas-cu12==12.1.3.1 nvidia-cuda-cupti-cu12==12.1.105 nvidia-cuda-nvrtc-cu12==12.1.105 nvidia-cuda-runtime-cu12==12.1.105 nvidia-cudnn-cu12==8.9.2.26 nvidia-cufft-cu12==11.0.2.54 nvidia-curand-cu12==10.3.2.106 nvidia-cusolver-cu12==11.4.5.107 nvidia-cusparse-cu12==12.1.0.106 nvidia-nccl-cu12==2.20.5 nvidia-nvjitlink-cu12==12.4.127 nvidia-nvtx-cu12==12.1.105 oauthlib==3.2.2 onnxruntime==1.17.3 openai==1.25.1 opentelemetry-api==1.24.0 opentelemetry-exporter-otlp-proto-common==1.24.0 opentelemetry-exporter-otlp-proto-grpc==1.24.0 opentelemetry-exporter-otlp-proto-http==1.24.0 opentelemetry-instrumentation==0.45b0 opentelemetry-instrumentation-asgi==0.45b0 opentelemetry-instrumentation-fastapi==0.45b0 opentelemetry-proto==1.24.0 opentelemetry-sdk==1.24.0 opentelemetry-semantic-conventions==0.45b0 
opentelemetry-util-http==0.45b0 orjson==3.10.2 outcome==1.3.0.post0 overrides==7.7.0 packaging==23.2 pandas==2.2.2 parso==0.8.4 pexpect==4.9.0 pillow==10.3.0 platformdirs==4.2.1 playwright==1.43.0 pluggy==1.5.0 posthog==3.5.0 prompt-toolkit==3.0.43 proto-plus==1.23.0 protobuf==4.25.3 psutil==5.9.8 ptyprocess==0.7.0 pulsar-client==3.5.0 pure-eval==0.2.2 py==1.11.0 pyarrow==16.0.0 pyarrow-hotfix==0.6 pyasn1==0.6.0 pyasn1_modules==0.4.0 pycparser==2.22 pycryptodomex==3.20.0 pydantic==2.7.1 pydantic_core==2.18.2 pyee==11.1.0 PyGithub==1.59.1 Pygments==2.18.0 PyJWT==2.8.0 pylance==0.9.18 PyNaCl==1.5.0 pyparsing==3.1.2 pypdf==4.2.0 PyPika==0.48.9 pyproject_hooks==1.1.0 pyright==1.1.361 pysbd==0.3.4 PySocks==1.7.1 pytest==8.2.0 python-dateutil==2.9.0.post0 python-dotenv==1.0.1 pytube==15.0.0 pytz==2024.1 PyYAML==6.0.1 pyzmq==26.0.3 random-user-agent==1.0.1 rank-bm25==0.2.2 ratelimiter==1.2.0.post0 redis==5.0.4 regex==2023.12.25 requests==2.31.0 requests-file==2.0.0 requests-oauthlib==2.0.0 retry==0.9.2 rich==13.7.1 rsa==4.9 s3transfer==0.10.1 safetensors==0.4.3 schema==0.7.7 scikit-learn==1.4.2 scipy==1.13.0 selenium==4.20.0 semver==3.0.2 sentence-transformers==2.7.0 shapely==2.0.4 six==1.16.0 smmap==5.0.1 sniffio==1.3.1 sortedcontainers==2.4.0 soupsieve==2.5 SQLAlchemy==2.0.29 stack-data==0.6.3 starlette==0.37.2 striprtf==0.0.26 sympy==1.12 tavily-python==0.3.3 tenacity==8.2.3 threadpoolctl==3.5.0 tiktoken==0.6.0 tld==0.13 tldextract==5.1.2 tokenizers==0.19.1 tomli==2.0.1 torch==2.3.0 tornado==6.4 tqdm==4.66.4 trafilatura==1.9.0 traitlets==5.14.3 transformers==4.40.1 trio==0.25.0 trio-websocket==0.11.1 triton==2.3.0 typer==0.9.4 types-requests==2.32.0.20240602 typing-inspect==0.9.0 typing_extensions==4.11.0 tzdata==2024.1 tzlocal==5.2 ujson==5.9.0 undetected-playwright==0.3.0 uritemplate==4.1.1 urllib3==2.2.1 uuid6==2024.1.12 uvicorn==0.29.0 uvloop==0.19.0 watchfiles==0.21.0 wcwidth==0.2.13 websocket-client==1.8.0 websockets==12.0 wrapt==1.16.0 wsproto==1.2.0 yarl==1.9.4 
youtube-transcript-api==0.6.2 yt-dlp==2023.12.30 zipp==3.18.1

platform: Ubuntu 22.04 LTS
python version: 3.10.12

keenborder786 commented 3 months ago

Please use AgentExecutor and also try to use JSON Chat Agent. In your provided code, I don't see any specific prompts indicating the type of the agent that you are using.

HEYBOY789 commented 3 months ago

> Please use AgentExecutor and also try to use JSON Chat Agent. In your provided code, I don't see any specific prompts indicating the type of the agent that you are using.

I just use an LLM with tools to decide whether to use tools or not, similar to the LangGraph tutorial.

Here's the line of code: `llm_with_tool = model.bind_tools(tools)`. This returns a message with tool calls if the LLM decides to use a tool, or a normal message if it decides to respond on its own.
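For reference, each entry in `res.tool_calls` is a dict with `name`, `args`, and `id` keys, as in the expected response shown above. A minimal dispatch sketch, assuming a `tools_by_name` mapping like the one in the original code (`dispatch_last_tool_call` is a hypothetical helper name):

```python
def dispatch_last_tool_call(tool_calls, tools_by_name):
    """Run the most recent tool call, mirroring the run_tools logic above.

    Returns the tool's result, or None when the model made no tool call
    (i.e. it answered directly).
    """
    if not tool_calls:
        return None
    last = tool_calls[-1]
    tool = tools_by_name[last["name"]]
    return tool.invoke(last["args"])
```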

However, the issue isn't that I'm not using AgentExecutor. The problem lies within the Groq LLM module itself. It occurs even when I use an LCEL chain, not just with the agent. I tested my prompt in the chat UI on the Groq homepage and didn't encounter this problem. I even tested my prompt with LM Studio running Llama3-7b, thinking the issue might be with the model itself, but there were no errors. Therefore, the only remaining possibility is that the problem is with the Groq module itself.