Open minmin-intel opened 2 months ago
The llama3.1 is new model for TGI, so the chat template support is also new for TGI. Our TGI is only 2.0.4 version. The fixed patch is included inPR 2454. We need upgrade the tgi-gaudi to latest. This will take some time.
https://github.com/huggingface/tgi-gaudi/pull/225 The patch is under review.
System Info
TGI-gaudi 2.0.4 docker image. Model = meta-llama/Meta-Llama-3.1-70B-Instruct HW = Gaudi2, 4 cards python 3.11 langchain 0.2.12 langchain-core 0.2.28 langchain-huggingface 0.0.3 huggingface-hub 0.24.0
Information
Tasks
Reproduction
I was using tgi-gaudi to serve llama3.1-7-B-instruct as the llm engine for my agent application. After the agent runs through one round of user/assistant (with tool calls)/tool/user messages, when need to generate the next AI message by llm, tgi-gaudi gives 422 client error - Template error: unknown test: test iterable is unknown (in:99) . See example below. As you can see, there is no "test" in the messages sent to TGI-gaudi. So I'm suspecting this issue is caused by something internal to TGI-gaudi.
You can use the code here to reproduce this issue.
I used the command below to launch tgi-gaudi docker run -d --name tgi-server -p 8085:80 -v $WORKDIR/hf_cache:/data --runtime=habana -e HF_TOKEN=$HF_TOKEN -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e HABANA_VISIBLE_DEVICES=0,1,2,3 -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host ghcr.io/huggingface/tgi-gaudi:2.0.4 --model-id meta-llama/Meta-Llama-3.1-70B-Instruct --sharded true --num-shard 4 --max-input-tokens 4096 --max-total-tokens 8192
Example error output: message sent to TGI: [{'role': 'user', 'content': 'Question: how many songs has the band the beatles released that have been recorded at abbey road studios? \nThe question was asked at: 03/21/2024, 23:37:29 PT'}, {'role': 'assistant', 'content': '', 'tool_calls': [{'function': {'name': 'search_knowledge_base', ' arguments': '{"query": "The Beatles songs recorded at Abbey Road Studios"}'}}]}, {'role': 'tool', 'content': 'Personnel[edit]\nWritten and composed by Rod Te mperton\nProduced by Quincy Jones\nMichael Jackson: lead and background vocals, LinnDrum drum machine\nRod Temperton and Brian Banks: synthesizers\nGreg Phil linganes: synthesizers, Rhodes piano\nAnthony Marinelli: synthesizer programming\nDavid Williams: guitar\nJerry Hey, Gary Grant: trumpets, flugelhorns\nLarry Williams: saxophone, flute\nBill Reichenbach: trombone\nVocal, rhythm and synthesizer arrangement by Rod Temperton\nHorn arrangement by Jerry Hey\nEffects b y Bruce Cannon and Bruce Swedien\nFeaturing: Narration by Vincent Price (Not featured on original edited single version)\nCharts[edit]\nWeekly charts[edit]\n Chart (1983–1985)\nPeakposition\nAustralia (Kent Music Report)[51]\n4\nBelgium (Ultratop 50 Flanders)[52]\n1\nCanadian RPM Top Singles[53]\n3\nFinland (Suome n virallinen singlelista)[54]\n7\nFinland Jukebox (Suomen virallinen singlelista)[54]\n3\nFrance (SNEP)[28]\n1\nIreland (IRMA)[55]\n4\nNetherlands (Dutch Top 40)[56]\n3\nNetherlands (Single Top 100)[57]\n4\nNew Zealand (Recorded Music NZ)[58]\n6\nPortugal (AFP)[59]\n1\nSouth Africa (Springbok)[60]\n26\nSpain (AFY VE)[61]\n1\nUK Singles (OCC)[27]\n10\nUS Cashbox[62]\n4\nUS Billboard Hot 100[63]\n4\nUS Billboard Hot Black Singles[64][26]\n3\nUS Billboard Adult Contemporary[65]\n24\nUS Billboard Album Rock Tracks[64][26]\n42\nUS Radio & Records CHR/Pop Airplay Chart[66]\n1\nWest Germany (Official German Charts)[67]\n9\nChart (2006)\nPeakposition\nFrance (SNEP)[68]\n35\nGermany (Media Control Charts)[35]\n9\nIreland (IRMA)[55]\n8\nItaly (FIMI)[69]\n5\nNetherlands (Single Top 100)[57]\n34\nSpain (PROMUSICAE)[35]\n1\nSwitzerland (Schweizer Hitparade)[35]\n3\nChart (2007)\nPeakposition\nSpain (PROMUSICAE)[70]\n20\nUK Singles (OCC)[27]\n57\nChart (2008)\nPeakposition\nAustria (Ö3 Austria Top 40)[71]\n55\nNorway (VG-lista)[72]\n13\nSwitzerland (Schweizer Hitparade)[73]\n53\nUK Singles (OCC)[27]\n35\nChart (2009)\nPeakposition\nAustralia (ARIA)[74]\n3\nAustria (Ö3 Austria Top 40)[71]\n5\nBelgium (Ultratop 50 Back Catalogue Singles Flanders)[75]\n3\n', 'name': 'search_knowledge_base'}, {'role': 'user', 'content': 'Retrieved document is not sufficient or relevant to answer the query. Reformulate the query to search knowledge base again.'}] ***Error: 422 Client Error: Unprocessable Entity for url: http://:8085/v1/chat/completions
Template error: unknown test: test iterable is unknown (in:99)
Expected behavior
Expected output should be the tgi-gaudi should be able to process many rounds of user/assistant (with tool calls)/tool/user messages with llama3.1-70B-instruct model