metagpt.utils.text / generate_prompt_chunk() / Calculation for max token length is flawed because LLM model name could actually hold the DEPLOYMENT model name. #1225
Bug description
The calculation for max token length, i.e.:
...
reserved = reserved + count_string_tokens(prompt_template + system_text, model_name)
# 100 is a magic number to ensure the maximum context length is not exceeded
max_token = TOKEN_MAX.get(model_name, 2048) - reserved - 100
...
is flawed on two counts:
It does not check that the result is positive; a negative value should raise an exception. If a negative value is derived for max_token, the subsequent while loop repeatedly grows the paragraphs list with empty lines instead of ever yielding a chunk.
It assumes that model_name always holds an LLM model name, but in the case of an Azure OpenAI Service deployment it can hold a DEPLOYMENT model name instead. Because the deployment name can be anything, the TOKEN_MAX lookup will fail to match a valid model name and return the default value, which can yield a negative max_token (see the reproduction sketch below).
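For concreteness, the following minimal sketch reproduces the failure mode. TOKEN_MAX and count_string_tokens here are simplified stand-ins for metagpt.utils.token_counter, the context limits are illustrative, and "my-gpt35-deployment" is a hypothetical Azure deployment name:

TOKEN_MAX = {"gpt-3.5-turbo": 16385, "gpt-4": 8192}  # illustrative limits, keyed by model name

def count_string_tokens(text: str, model_name: str) -> int:
    # Stand-in for the real tiktoken-based counter; assume a long prompt template.
    return 2500

reserved = count_string_tokens("<prompt_template + system_text>", "my-gpt35-deployment")

# The user-chosen Azure deployment name misses the lookup and falls back to 2048.
max_token = TOKEN_MAX.get("my-gpt35-deployment", 2048) - reserved - 100
print(max_token)  # 2048 - 2500 - 100 = -552; nothing checks the sign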
Bug solved method
To resolve this issue, we need to distinguish between the model name and the deployment name in the LLMConfig: one identifies the type of model being used, whereas the other identifies an instance/deployment of that type (see the sketch below).
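As a rough sketch of that distinction (the LLMConfig fields and the max_prompt_tokens helper below are illustrative assumptions, not MetaGPT's actual API), token limits are always looked up by model type, and a non-positive budget fails fast:

from dataclasses import dataclass
from typing import Optional

TOKEN_MAX = {"gpt-3.5-turbo": 16385, "gpt-35-turbo": 4096}  # illustrative limits

@dataclass
class LLMConfig:
    model: str                             # model type, e.g. "gpt-35-turbo"
    deployment_name: Optional[str] = None  # user-chosen Azure instance name

def max_prompt_tokens(config: LLMConfig, reserved: int) -> int:
    # Look up the context window by model type, never by deployment name.
    max_token = TOKEN_MAX.get(config.model, 2048) - reserved - 100
    if max_token <= 0:
        # Fail fast instead of letting the chunking loop spin on a negative budget.
        raise ValueError(f"reserved tokens ({reserved}) exceed the context window of {config.model}")
    return max_token

With this split, the Azure client routes requests by deployment_name while token accounting keys off model, and a misconfiguration surfaces as an immediate error instead of an endless chunking loop.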
Environment information
LLM type and model name: Azure OpenAI Service gpt-35-turbo
poetry = "~1.7.0"
python = ">=3.9,<3.13"
aiohttp = "3.8.6"
channels = "4.0.0"
faiss_cpu = "1.7.4"
fire = "0.4.0"
typer = "0.9.0"
lancedb = "0.4.0"
loguru = "0.6.0"
meilisearch = "0.21.0"
numpy = "^1"
openai = "1.6.1"
openpyxl = "3.1.2"
beautifulsoup4 = "4.12.3"
pandas = "2.1.1"
pydantic = "2.5.3"
python_docx = "0.8.11"
PyYAML = "6.0.1"
setuptools = "65.6.3"
tenacity = "8.2.3"
tiktoken = "0.6.0"
tqdm = "4.66.2"
anthropic = "0.18.1"
typing-inspect = "0.8.0"
libcst = "1.0.1"
qdrant-client = "1.7.0"
ta = "0.10.2"
semantic-kernel = "0.4.3.dev0"
wrapt = "1.15.0"
aioredis = "~2.0.1"
websocket-client = "1.6.2"
aiofiles = "23.2.1"
gitpython = "3.1.40"
zhipuai = "2.0.1"
rich = "13.6.0"
nbclient = "0.9.0"
nbformat = "5.9.2"
ipython = "8.17.2"
ipykernel = "6.27.1"
scikit_learn = "1.3.2"
typing-extensions = "4.9.0"
socksio = "~1.0.0"
gitignore-parser = "0.1.9"
websockets = "~11.0"
networkx = "~3.2.1"
google-generativeai = "0.4.1"
playwright = ">=1.26"
anytree = "2.12.1"
ipywidgets = "8.1.1"
Pillow = "10.3.0"
imap_tools = "1.5.0"
qianfan = "0.3.2"
dashscope = "1.14.1"
rank-bm25 = "0.2.2"
gymnasium = "0.29.1"
jieba = "0.42.1"
beautifulsoup4 = "~4.12.3"
dependency-injector = "~4.41.0"
duckduckgo_search = "~5.3.0"
google-api-python-client = "~2.127.0"
playwright = "~1.43.0"
selenium = "~4.19.0"
webdriver-manager = "~4.0.1"
Screenshots or logs
N/A