OpenBMB / RepoAgent

An LLM-powered repository agent designed to assist developers and teams in generating documentation and understanding repositories quickly.
Apache License 2.0

KeyError: `default_completion_kwargs` raised in ai_doc\chat_engine.py #17

Closed: Umpire2018 closed this issue 6 months ago

Umpire2018 commented 6 months ago

Description:

Encountered a KeyError when accessing default_completion_kwargs in chat_engine.py.

Code Snippet:

model = self.config["default_completion_kwargs"]["model"]

Error Message:

  File "ai_doc\chat_engine.py", line 103, in generate_doc
    model = self.config["default_completion_kwargs"]["model"]
            ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'default_completion_kwargs'

Expected Behavior:

The default_completion_kwargs key should be present in the config dictionary.
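Until config.yml ships with this section, a defensive lookup would at least fail with a readable message. This is only a sketch; the error wording is an assumption, not code from the repo:

completion_kwargs = self.config.get("default_completion_kwargs")
if completion_kwargs is None:
    # Fail early with a hint instead of a bare KeyError
    raise KeyError(
        "config.yml has no 'default_completion_kwargs' section; "
        "please add it before running generate_doc."
    )
model = completion_kwargs["model"]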

Additional Improvement:

I propose adding a new function find_engine_or_model that searches the nested 'api_keys' dictionary for an 'engine' or 'model' key and returns the first occurrence of either.

def find_engine_or_model(data):
    # Walk each provider entry under 'api_keys' and return the first
    # 'engine' or 'model' value found.
    for first_level_value in data['api_keys'].values():
        for item in first_level_value:
            if 'engine' in item:
                return item['engine']
            elif 'model' in item:
                return item['model']
    return None
Umpire2018 commented 6 months ago

Result:

# Test data
api_keys:
  gpt-3.5-turbo-16k:
    - api_key: 'sk-XXXX'
      base_url: 'https://example.com/v1/'
      api_type: 'azure'
      api_version: 'XXX'
      engine: 'GPT-35-Turbo-16k'
    - api_key: 'sk-xxxxx'
      organization: 'org-xxxxxx'
      model: 'gpt-3.5-turbo-16k'
  gpt-4:
    - api_key: 'sk-XXXX'
      base_url: 'https://example.com/v1/'
      model: 'gpt-4'
print(find_engine_or_model(data))
GPT-35-Turbo-16k
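For reference, the call above only works once the YAML is loaded into a Python dict; a minimal self-contained version of the test, assuming PyYAML is installed, would be:

import yaml

config_text = """
api_keys:
  gpt-3.5-turbo-16k:
    - api_key: sk-XXXX
      engine: GPT-35-Turbo-16k
    - api_key: sk-xxxxx
      model: gpt-3.5-turbo-16k
  gpt-4:
    - api_key: sk-XXXX
      model: gpt-4
"""

data = yaml.safe_load(config_text)
print(find_engine_or_model(data))  # prints: GPT-35-Turbo-16k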
Umpire2018 commented 6 months ago

It seems this issue has already been fixed in https://github.com/LOGIC-10/AI_doc/commit/2e6925522bb935a357204db7156159324fce17b0, but config.yml was not updated accordingly :(

Umpire2018 commented 6 months ago

Description:

I only set gpt-3.5-turbo in config.yml, but the program switched to gpt-3.5-turbo-16k and then used that name to look up the config, which raised a KeyError.

Error Message:

  File "C:\Users\Bangsheng.Feng\Music\arno\AI_doc\ai_doc\chat_engine.py", line 125, in generate_doc
    api_key=self.config["api_keys"][model][0]["api_key"],
            ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
KeyError: 'gpt-3.5-turbo-16k'

Additional Improvement:

I propose keeping the originally configured model name in original_model, using its config entry for the client credentials, and sending the OpenAI request with modified_model.

original_model = ChatEngine.find_engine_or_model(data=self.config)

# Check the combined token length of the prompts
if self.num_tokens_from_string(sys_prompt) + self.num_tokens_from_string(usr_prompt) >= 3500:
    print("The code is too long, using gpt-3.5-turbo-16k to process it.")
    modified_model = "gpt-3.5-turbo-16k"
else:
    modified_model = original_model

# Take credentials from the original model's config entry,
# but make the request with modified_model
client = OpenAI(
    api_key=self.config["api_keys"][original_model][0]["api_key"],
    base_url=self.config["api_keys"][original_model][0]["base_url"],
    timeout=self.config["default_completion_kwargs"]["request_timeout"],
)

response = client.chat.completions.create(
    model=modified_model,
    messages=messages,
    temperature=self.config["default_completion_kwargs"]["temperature"],
)
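num_tokens_from_string is used above but not shown; a common way to implement it is with tiktoken, as in the OpenAI cookbook. This is a sketch under that assumption and may not match the repo's actual implementation:

import tiktoken

def num_tokens_from_string(string: str, encoding_name: str = "cl100k_base") -> int:
    # Count tokens with the tokenizer used by the gpt-3.5 / gpt-4 chat models
    encoding = tiktoken.get_encoding(encoding_name)
    return len(encoding.encode(string))

In ChatEngine it would be a method, called as self.num_tokens_from_string(sys_prompt).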
pooruss commented 6 months ago

Please try the latest config.yml:

api_keys:
  gpt-3.5-turbo-16k:
    - api_key: sk-XXXX
      base_url: https://example.com/v1/
      api_type: azure
      api_version: XXX
      engine: GPT-35-Turbo-16k
      # you can use any kwargs supported by openai.ChatCompletion here
    - api_key: sk-xxxxx
      organization: org-xxxxxx
      model: gpt-3.5-turbo-16k
  gpt-4:    
    - api_key: sk-XXXX
      base_url: https://example.com/v1/
      model: gpt-4

  gpt-4-32k:
    - api_key: sk-XXXX
      base_url: https://example.com/v1/
      api_type: XXX
      api_version: XXX
      engine: gpt4-32

default_completion_kwargs:
  model: gpt-4
  temperature: 0.2
  request_timeout: 60

repo_path: /path/to/your/repo
project_hierarchy: .project_hierarchy.json
Markdown_Docs_folder: /Markdown_Docs

language: zh
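Loading and validating the config once at startup would also catch a missing section before any request is made. This is a sketch; the helper name and the list of required keys are assumptions, not part of RepoAgent:

import yaml

REQUIRED_KEYS = ("api_keys", "default_completion_kwargs", "repo_path")

def load_config(path: str = "config.yml") -> dict:
    # Read the YAML config and fail early with a readable message
    with open(path, "r", encoding="utf-8") as f:
        config = yaml.safe_load(f)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config.yml is missing required sections: {missing}")
    return config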
Umpire2018 commented 6 months ago

@pooruss Can you continue working on https://github.com/LOGIC-10/AI_doc/issues/17#issuecomment-1865454922? I think there are still several things that need to be dealt with, such as:

  1. Error handling for #18
  2. When I get a KeyError, the except block prints only the model name (gpt-3.5-turbo) instead of the error stack:

    except Exception as e:
        print(f"An error occurred: {e}. Attempt {attempt + 1} of {max_attempts}")
        # Wait 10 seconds before retrying
        time.sleep(10)
        if attempt + 1 == max_attempts:
            raise

Possible solution (from GPT):

Based on the code and output you provided, the message "An error occurred: 'gpt-3.5-turbo-16k'" most likely means that somewhere inside the try block an exception was raised whose message is 'gpt-3.5-turbo-16k'. That exception is probably not an APIConnectionError or BadRequestError, so it falls through to the final except Exception as e handler.

The likely causes are:

Configuration problem: one of the lookups such as self.config["api_keys"][model][0]["api_key"] or self.config["api_keys"][model][0]["base_url"] fails because a key is missing or misconfigured, raising an exception whose message is the model name.

Code logic problem: while accessing the config or issuing the request, a logic error (for example an incorrect assumption about the config) raises an exception whose message happens to be the model name string.

To resolve this, you can:

Add more detailed error logging: print more information in the except Exception as e block, for example by using the traceback module to print the stack trace. This makes it much easier to locate the actual problem.

Check the configuration: make sure the self.config dictionary contains the expected keys and values, and that their structure matches what the code assumes.

Review the code logic: inspect the code inside the try block, especially the parts that use the model name or other configuration, and make sure there are no logic errors or wrong assumptions.

Here is an example that adds detailed error logging:

import traceback

# ... other code omitted ...

except Exception as e:
    print(f"An error occurred: {e}. Attempt {attempt + 1} of {max_attempts}")
    traceback.print_exc()  # Print the stack trace
    time.sleep(10)
    if attempt + 1 == max_attempts:
        raise
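A logging-based alternative does the same job: calling logging.exception("Attempt %d of %d failed", attempt + 1, max_attempts) inside the except block records the message together with the full stack trace through the standard logging module, which may fit better if the project already routes output through logging.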