spring-projects / spring-ai

An Application Framework for AI Engineering
https://docs.spring.io/spring-ai/reference/1.0-SNAPSHOT/index.html
Apache License 2.0
2.47k stars, 585 forks

GPT 4 Vision Support #659

Open Mr-LiuDC opened 2 months ago

Mr-LiuDC commented 2 months ago

springAiVersion: 0.8.1

This is an example I saw here, but based on my testing, it seems that the gpt-4-vision-preview model is not yet supported.

Spring AI - Multimodality - Orbis Sensualium Pictus

Map<?, ?> aiVision() {
    // User message with a text question plus one image, referenced by URL.
    var userMessage = new UserMessage("What is in the picture?",
            List.of(new Media(MimeTypeUtils.IMAGE_PNG, "https://docs.spring.io/spring-ai/reference/1.0-SNAPSHOT/_images/multimodal.test.png"))
    );

    // Select the model for this request through the prompt options.
    ChatResponse response = chatClient.call(new Prompt(List.of(userMessage),
            OpenAiChatOptions.builder().withModel(OpenAiApi.ChatModel.GPT_4_VISION_PREVIEW.getValue()).build()));
    return Map.of("result", response);
}

ThomasVitale commented 1 month ago

Multimodality was not part of Spring AI 0.8.1. You can try it out with version 1.0.0-SNAPSHOT. I have an example here: https://github.com/ThomasVitale/llm-apps-java-spring-ai/tree/main/02-prompts/prompts-multimodality-openai

OpenAI now supports vision through the gpt-4-turbo model. The gpt-4-vision-preview model was a preview and is no longer recommended (see: https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4). Still, the example works with both gpt-4-vision-preview and gpt-4-turbo.
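
For reference, switching between the two models only changes the `model` field of the underlying Chat Completions request; the user message still carries one text part and one `image_url` part. Below is a minimal sketch of that request body using only the JDK, assuming the public OpenAI request format. The class and method names (`VisionRequestSketch`, `buildVisionRequest`, `jsonEscape`) are hypothetical, and this is not a description of how Spring AI builds the request internally.

```java
class VisionRequestSketch {

    // Escapes backslash, double quote and newline. Enough for this sketch,
    // not a complete JSON string encoder.
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
    }

    // Builds a compact JSON body with a single user message that contains
    // a text part (the question) and an image_url part (the picture).
    static String buildVisionRequest(String model, String question, String imageUrl) {
        return "{"
                + "\"model\":\"" + jsonEscape(model) + "\","
                + "\"messages\":[{"
                + "\"role\":\"user\","
                + "\"content\":["
                + "{\"type\":\"text\",\"text\":\"" + jsonEscape(question) + "\"},"
                + "{\"type\":\"image_url\",\"image_url\":{\"url\":\"" + jsonEscape(imageUrl) + "\"}}"
                + "]"
                + "}]"
                + "}";
    }

    public static void main(String[] args) {
        // Same image as in the issue; only the model name differs between runs.
        String image = "https://docs.spring.io/spring-ai/reference/1.0-SNAPSHOT/_images/multimodal.test.png";
        System.out.println(buildVisionRequest("gpt-4-turbo", "What is in the picture?", image));
        System.out.println(buildVisionRequest("gpt-4-vision-preview", "What is in the picture?", image));
    }
}
```

Printing both bodies side by side makes it easy to confirm that nothing besides the model name differs, which is consistent with the example working against either model.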