metagpt.utils.text / generate_prompt_chunk() / Calculation for max token length is flawed because LLM model name could actually hold the DEPLOYMENT model name. #1225
Bug description
The calculation for max token length, i.e.:
...
reserved = reserved + count_string_tokens(prompt_template + system_text, model_name)
# 100 is a magic number to ensure the maximum context length is not exceeded
max_token = TOKEN_MAX.get(model_name, 2048) - reserved - 100
...
is flawed on two counts:
It does not check that the result is positive; a negative value should raise an exception. If a negative value is derived for max_token, the subsequent while loop repeatedly grows the paragraphs list with empty lines instead of ever yielding a chunk.
It assumes that model_name always holds an LLM model name, but in the case of an Azure OpenAI Service deployment it can hold a DEPLOYMENT model name instead. Because the deployment name can be anything, the TOKEN_MAX lookup will fail to match a valid model name and return the default value, which can yield a negative max_token (see the reproduction sketch below).
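For concreteness, the following minimal sketch reproduces the failure mode. TOKEN_MAX and count_string_tokens here are simplified stand-ins for metagpt.utils.token_counter, the context limits are illustrative, and "my-gpt35-deployment" is a hypothetical Azure deployment name:

TOKEN_MAX = {"gpt-3.5-turbo": 16385, "gpt-4": 8192}  # illustrative limits, keyed by model name

def count_string_tokens(text: str, model_name: str) -> int:
    # Stand-in for the real tiktoken-based counter; assume a long prompt template.
    return 2500

reserved = count_string_tokens("<prompt_template + system_text>", "my-gpt35-deployment")

# The user-chosen Azure deployment name misses the lookup and falls back to 2048.
max_token = TOKEN_MAX.get("my-gpt35-deployment", 2048) - reserved - 100
print(max_token)  # 2048 - 2500 - 100 = -552; nothing checks the sign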
Bug solved method
To resolve this issue, we need to distinguish between the model name and the deployment name in the LLMConfig: one identifies the type of model being used, whereas the other identifies an instance/deployment of that type (see the sketch below).
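As a rough sketch of that distinction (the LLMConfig fields and the max_prompt_tokens helper below are illustrative assumptions, not MetaGPT's actual API), token limits are always looked up by model type, and a non-positive budget fails fast:

from dataclasses import dataclass
from typing import Optional

TOKEN_MAX = {"gpt-3.5-turbo": 16385, "gpt-35-turbo": 4096}  # illustrative limits

@dataclass
class LLMConfig:
    model: str                             # model type, e.g. "gpt-35-turbo"
    deployment_name: Optional[str] = None  # user-chosen Azure instance name

def max_prompt_tokens(config: LLMConfig, reserved: int) -> int:
    # Look up the context window by model type, never by deployment name.
    max_token = TOKEN_MAX.get(config.model, 2048) - reserved - 100
    if max_token <= 0:
        # Fail fast instead of letting the chunking loop spin on a negative budget.
        raise ValueError(f"reserved tokens ({reserved}) exceed the context window of {config.model}")
    return max_token

With this split, the Azure client routes requests by deployment_name while token accounting keys off model, and a misconfiguration surfaces as an immediate error instead of an endless chunking loop.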
Environment information
LLM type and model name: Azure OpenAI Service gpt-35-turbo
poetry = "~1.7.0"
python = ">=3.9,<3.13"
aiohttp = "3.8.6"
channels = "4.0.0"
faiss_cpu = "1.7.4"
fire = "0.4.0"
typer = "0.9.0"
lancedb = "0.4.0"
loguru = "0.6.0"
meilisearch = "0.21.0"
numpy = "^1"
openai = "1.6.1"
openpyxl = "3.1.2"
beautifulsoup4 = "4.12.3"
pandas = "2.1.1"
pydantic = "2.5.3"
python_docx = "0.8.11"
PyYAML = "6.0.1"
setuptools = "65.6.3"
tenacity = "8.2.3"
tiktoken = "0.6.0"
tqdm = "4.66.2"
anthropic = "0.18.1"
typing-inspect = "0.8.0"
libcst = "1.0.1"
qdrant-client = "1.7.0"
ta = "0.10.2"
semantic-kernel = "0.4.3.dev0"
wrapt = "1.15.0"
aioredis = "~2.0.1"
websocket-client = "1.6.2"
aiofiles = "23.2.1"
gitpython = "3.1.40"
zhipuai = "2.0.1"
rich = "13.6.0"
nbclient = "0.9.0"
nbformat = "5.9.2"
ipython = "8.17.2"
ipykernel = "6.27.1"
scikit_learn = "1.3.2"
typing-extensions = "4.9.0"
socksio = "~1.0.0"
gitignore-parser = "0.1.9"
websockets = "~11.0"
networkx = "~3.2.1"
google-generativeai = "0.4.1"
playwright = ">=1.26"
anytree = "2.12.1"
ipywidgets = "8.1.1"
Pillow = "10.3.0"
imap_tools = "1.5.0"
qianfan = "0.3.2"
dashscope = "1.14.1"
rank-bm25 = "0.2.2"
gymnasium = "0.29.1"
jieba = "0.42.1"
beautifulsoup4 = "~4.12.3"
dependency-injector = "~4.41.0"
duckduckgo_search = "~5.3.0"
google-api-python-client = "~2.127.0"
playwright = "~1.43.0"
selenium = "~4.19.0"
webdriver-manager = "~4.0.1"
Screenshots or logs
N/A