zhanghx0905 commented 3 months ago

When sending requests to a locally deployed GLM4v model using Xinference, an error is encountered:

LLM (Large Language Model) error, Please check your key or base_url, or network.

Upon investigation, it was identified that the request body construction for openai messages is incorrect. The image_url.url field should include a format hint "data:image/png;base64," to properly encode the image data.

GeneralAgent: 0.3.13
gptpdf: 0.0.7

Steps to Reproduce

Deploy GLM4v using Xinference locally.
Run the example
```
from gptpdf import parse_pdf
```

pdf_path = "./attention_is_all_you_need.pdf" output_dir = "./attention_is_all_you_need/"

Use OPENAI_API_KEY and OPENAI_API_BASE from environment variables

content, image_paths = parse_pdf( pdf_path, output_dir=output_dir, model="glm-4v", verbose=True, api_key="KEY", base_url="URL", ) print(content) print(image_paths)

zhanghx0905 commented 3 months ago

https://github.com/CosmosShadow/GeneralAgent/blob/3fbe8fa0118f53d3115eddc08b13bb780240a542/GeneralAgent/skills/llm_inference.py#L118

That's the root of the problem. One simple solution is to change the name of the locally deployed glm4-v model.

https://github.com/CosmosShadow/GeneralAgent/pull/7