carlrobertoh / llm-client

User-friendly Java HTTP client that provides access to large language model APIs and services
https://central.sonatype.com/artifact/ee.carlrobert/llm-client
MIT License

Support for Image generation in DallE #41

Open moritzfl opened 2 months ago

moritzfl commented 2 months ago

Add support for image generation through DALL-E. This would reuse most of the existing settings for Azure and OpenAI, and additionally add a DALL-E model name and settings for the generated images (e.g. size).

The following is just to illustrate the simplicity of actually requesting an image. This is not code that I'd directly put into llm-client (as it is "stringly"-typed and does not pull settings from the usual configuration objects in llm-client).

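    import java.net.URI;
    import java.net.http.HttpRequest;

    // Azure OpenAI endpoint: https://{resource}.openai.azure.com/openai/deployments/{deployment}/images/generations
    String url = String.format("https://%s.openai.azure.com/openai/deployments/%s/images/generations?api-version=%s",
            "resource-name",  // name of the Azure OpenAI resource
            "dalle3",         // name of the DALL-E deployment
            "2024-02-01");    // API version

    // The prompt is inserted verbatim, so it must not contain unescaped JSON characters such as quotes
    String jsonPayload = String.format("""
    {
        "prompt": "%s",
        "n": 1,
        "size": "1024x1024"
    }
    """, "prompt for image generation (e.g. an image of a cat)");

    // Build the HttpRequest
    HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Content-Type", "application/json")
            .header("api-key", "your personal api key for Azure")
            .POST(HttpRequest.BodyPublishers.ofString(jsonPayload))
            .build();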
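For comparison, a typed variant of the same request could look roughly like the sketch below. All type and method names here (`ImageSize`, `ImageGenerationRequest`, `AzureImageRequests`) are hypothetical and do not exist in llm-client; the listed sizes are the ones I assume DALL-E 3 accepts, and the JSON is built by hand only to keep the example dependency-free.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical types for illustration; none of these exist in llm-client.
enum ImageSize {
    // Assumed DALL-E 3 sizes; other models may accept different values.
    SQUARE_1024("1024x1024"),
    LANDSCAPE_1792("1792x1024"),
    PORTRAIT_1792("1024x1792");

    final String value;

    ImageSize(String value) {
        this.value = value;
    }
}

record ImageGenerationRequest(String prompt, int n, ImageSize size) {

    // Serializes by hand; a real implementation would more likely use a JSON library.
    String toJson() {
        return "{\"prompt\":\"" + escape(prompt) + "\",\"n\":" + n
                + ",\"size\":\"" + size.value + "\"}";
    }

    // Minimal escaping of backslashes, quotes and common whitespace escapes;
    // other control characters are not handled in this sketch.
    private static String escape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r")
                .replace("\t", "\\t");
    }
}

final class AzureImageRequests {

    // Builds the POST request for an Azure OpenAI image generation deployment.
    static HttpRequest build(String resource, String deployment, String apiVersion,
                             String apiKey, ImageGenerationRequest body) {
        String url = String.format(
                "https://%s.openai.azure.com/openai/deployments/%s/images/generations?api-version=%s",
                resource, deployment, apiVersion);
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("Content-Type", "application/json")
                .header("api-key", apiKey)
                .POST(HttpRequest.BodyPublishers.ofString(body.toJson()))
                .build();
    }
}
```

With this shape, the size can no longer be mistyped and a prompt containing quotes no longer breaks the payload; the resource name, deployment name and API key would come from the usual configuration objects instead of being passed as plain strings.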

I am willing to work on this feature if you think it would fit the llm-client project.

However, I am unsure how best to integrate image generation capabilities into CodeGPT. Image generation models can be separate from the models used for text responses, so it might make sense to configure image generation models separately in CodeGPT.

In the future, one might even use image generation that is not tied to an LLM provider at all, such as Stable Diffusion.

I would also gladly hand over the CodeGPT implementation to someone else ;)

Anyway, API support needs to be implemented first, before taking care of the UI in CodeGPT.