spring-projects / spring-ai

An Application Framework for AI Engineering
https://docs.spring.io/spring-ai/reference/1.0-SNAPSHOT/index.html
Apache License 2.0

Outdated/Deprecated Example in OpenAI Multimodal Documentation #1012

Open AshwinKrishnaK opened 2 weeks ago

AshwinKrishnaK commented 2 weeks ago

The OpenAI Multimodal example in the current documentation is outdated: it uses methods that are deprecated in the latest version. The documented examples are as follows:

byte[] imageData = new ClassPathResource("/multimodal.test.png").getContentAsByteArray();

var userMessage = new UserMessage("Explain what do you see on this picture?",
        List.of(new Media(MimeTypeUtils.IMAGE_PNG, imageData)));

ChatResponse response = chatModel.call(new Prompt(List.of(userMessage),
        OpenAiChatOptions.builder().withModel(OpenAiApi.ChatModel.GPT_4_VISION_PREVIEW.getValue()).build()));

// Second documented example: the same request using an image URL and the GPT_4_O model
var userMessage = new UserMessage("Explain what do you see on this picture?",
        List.of(new Media(MimeTypeUtils.IMAGE_PNG,
                "https://docs.spring.io/spring-ai/reference/1.0-SNAPSHOT/_images/multimodal.test.png")));

ChatResponse response = chatModel.call(new Prompt(List.of(userMessage),
        OpenAiChatOptions.builder().withModel(OpenAiApi.ChatModel.GPT_4_O.getValue()).build()));

Issues:

  1. Passing the image to Media as a byte[] (imageData) is deprecated.
  2. The chatModel.call method used in the example is also deprecated.

This outdated example may cause confusion and errors for users trying to implement multimodal functionality using the latest version of the OpenAI API.
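To make issue 1 concrete, here is a minimal JDK-only sketch of the two ways to identify the same image: as raw bytes held in memory, or as a URL that is resolved when needed. It deliberately uses no Spring AI types. The class and method names (`ImageSourceSketch`, `toUrl`, `readAll`) are hypothetical and exist only for illustration, and the four-byte "image" is placeholder data rather than a real PNG.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical illustration class; not part of Spring AI.
public class ImageSourceSketch {

    // Reference form: turn a local file path into a file: URL.
    static URL toUrl(Path path) throws IOException {
        return path.toUri().toURL();
    }

    // Inline form: resolve a URL to the raw bytes it points at.
    static byte[] readAll(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("multimodal-sketch", ".png");
        try {
            // Placeholder content (first bytes of the PNG signature only).
            byte[] original = {(byte) 0x89, 'P', 'N', 'G'};
            Files.write(tmp, original);

            URL url = toUrl(tmp);
            byte[] viaUrl = readAll(url);

            System.out.println("protocol=" + url.getProtocol());
            System.out.println("same bytes=" + Arrays.equals(original, viaUrl));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Both forms carry the same data; the difference is only whether the caller or the receiver loads the bytes, which is the distinction the byte[] versus URL question above turns on.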

AshwinKrishnaK commented 2 weeks ago

Updated Examples:

var image = new ClassPathResource("/multimodal.test.png").getURL();
var userMessage = new UserMessage("Explain what do you see on this picture?",
                List.of(new Media(MimeTypeUtils.IMAGE_PNG, image)));
ChatResponse response = chatClient.prompt(new Prompt(userMessage, OpenAiChatOptions.builder()
                .withModel(OpenAiApi.ChatModel.GPT_4_VISION_PREVIEW)
                .build()))
                .call()
                .chatResponse();

// Second updated example: the same request using an image URL and the GPT_4_O model
var userMessage = new UserMessage("Explain what do you see on this picture?",
                List.of(new Media(MimeTypeUtils.IMAGE_PNG,
                         new URL("https://docs.spring.io/spring-ai/reference/1.0-SNAPSHOT/_images/multimodal.test.png"))));
ChatResponse response = chatClient.prompt(new Prompt(userMessage,
                OpenAiChatOptions.builder().withModel(OpenAiApi.ChatModel.GPT_4_O).build()))
                .call()
                .chatResponse();

Could you please review the updated examples and confirm whether they are correct? If they are, I would be happy to raise a PR to update the documentation.