Closed: ogallagher closed this 3 months ago
So far, it does seem that there's no way to get Gemini to generate a raster image. The furthest I got was an empty 1px square base64 PNG data string and a description of how to generate the image with external image generator models.
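A placeholder like the 1px PNG mentioned above could be detected automatically. The following is only a minimal sketch, assuming the model returns a base64 PNG string, optionally with a data-URI prefix. The names `png_dimensions` and `is_placeholder` and the `min_side` threshold are hypothetical and not part of this project.

```python
import base64
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: str) -> Tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of a base64 PNG string."""
    # Drop an optional "data:image/png;base64," prefix.
    if data.startswith("data:"):
        data = data.split(",", 1)[1]
    raw = base64.b64decode(data)
    # A PNG starts with an 8-byte signature, then a 4-byte chunk length,
    # then the chunk type, which must be IHDR for the first chunk.
    if raw[:8] != PNG_SIGNATURE or raw[12:16] != b"IHDR":
        raise ValueError("not a PNG")
    # Width and height are the first two big-endian uint32 fields of IHDR.
    width, height = struct.unpack(">II", raw[16:24])
    return width, height


def is_placeholder(data: str, min_side: int = 2) -> bool:
    """Treat any image with a side shorter than min_side (assumed threshold) as empty."""
    width, height = png_dimensions(data)
    return width < min_side or height < min_side
```

This only inspects the header, so it cannot tell whether a larger image is blank. It is meant as a cheap first filter for the proof of concept.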
I'm able to generate basic geometry in an SVG, but so far it cannot draw more complex shapes or real-world entities.
For example, when I asked it to draw a fish instead of a triangle, it gave the polygon the attribute id="fish" but did not change the geometry to resemble a fish. Other times, for similar prompts, it changed the geometry, but the shape looked nothing like a fish.
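The "renamed but not redrawn" case described above can be checked mechanically by comparing only the geometry attributes of two SVG responses. This is a rough sketch using the standard library. The function names and the lists of shape tags and geometry attributes are assumptions chosen for illustration, and it says nothing about whether a changed shape actually looks like a fish.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

SHAPE_TAGS = {"polygon", "polyline", "path", "rect", "circle", "ellipse", "line"}
GEOMETRY_ATTRS = {
    "points", "d", "x", "y", "width", "height",
    "cx", "cy", "r", "rx", "ry", "x1", "y1", "x2", "y2",
}

Shape = Tuple[str, Tuple[Tuple[str, str], ...]]


def geometry(svg_text: str) -> List[Shape]:
    """Collect (tag, geometry attributes) for every shape, ignoring id, fill, etc."""
    shapes = []
    for element in ET.fromstring(svg_text).iter():
        tag = element.tag.rsplit("}", 1)[-1]  # strip the XML namespace, if any
        if tag in SHAPE_TAGS:
            attrs = tuple(sorted(
                (name, " ".join(value.split()))  # normalize whitespace
                for name, value in element.attrib.items()
                if name in GEOMETRY_ATTRS
            ))
            shapes.append((tag, attrs))
    return shapes


def geometry_changed(before: str, after: str) -> bool:
    """True if the second SVG draws different shapes than the first."""
    return geometry(before) != geometry(after)
```

For the triangle-to-fish prompt, a response that only adds id="fish" to the same polygon yields `geometry_changed(...) == False`.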
In conclusion, I do not believe we can use Gemini to generate images, which confirms your doubt, @hoanghm.
@ogallagher We can instead ask the Gemini API to describe the image it wants, search for and fetch a set of candidate images from trusted sites, then send them back to the Gemini API and let it pick.
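The describe, search, and pick flow suggested here might look roughly like the sketch below. Everything in it is hypothetical: `describe`, `search`, and `pick` are injected callables standing in for the Gemini and image-search calls (no real API is assumed), and `TRUSTED_HOSTS` is a placeholder allowlist, not a list of sites anyone has agreed on.

```python
from typing import Callable, Iterable, List, Optional
from urllib.parse import urlparse

# Placeholder allowlist (assumption). Subdomains of a listed host are also accepted.
TRUSTED_HOSTS = {"images.example.org"}


def is_trusted(url: str, trusted_hosts: Iterable[str] = TRUSTED_HOSTS) -> bool:
    """True if the URL's host is a trusted host or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == t or host.endswith("." + t) for t in trusted_hosts)


def choose_image(
    topic: str,
    describe: Callable[[str], str],          # hypothetical: model describes the image it wants
    search: Callable[[str], List[str]],      # hypothetical: image search returning URLs
    pick: Callable[[str, List[str]], int],   # hypothetical: model returns index of best candidate
    max_candidates: int = 5,
) -> Optional[str]:
    """Return the URL the model picks, or None if nothing usable was found."""
    description = describe(topic)
    candidates = [url for url in search(description) if is_trusted(url)]
    candidates = candidates[:max_candidates]
    if not candidates:
        return None
    index = pick(description, candidates)
    # Guard against the model answering with an out-of-range index.
    if not 0 <= index < len(candidates):
        return None
    return candidates[index]
```

Passing the three steps in as callables keeps the sketch testable with stubs before any real API calls are wired in.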
There is some uncertainty as to whether the Gemini API will be able to generate images on its own, so we need a proof of concept for this.
Raster image support (e.g. JPG, PNG)
Vector image support
On failure of the above, decide on next steps
Craiyon web client. No API.
OpenAI DALL-E. Has API.
Stable Diffusion. Has API.
Midjourney. No API.
[x] Research how general Google AI chat generates images.
[x] Check the Google Imagen API intro tutorial.
Keep trying with Gemini. I haven't yet tried structured prompts instead of the chat interface. I also haven't tried having Gemini describe a scene geometrically in detail and then passing that description to the SVG request.
Ask the Gemini API to describe the image it wants, search for and fetch candidate images from trusted sites, then send them back to the Gemini API and let it pick.
Skip image generation.
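For the first option above (a detailed geometric description, then an SVG request built from it), a two-step prompt chain could be sketched as follows. The `generate` callable is a hypothetical stand-in for whatever text-generation call ends up being used, and the prompt wording and the 100 by 100 canvas are assumptions, not tested prompts.

```python
import re
import xml.etree.ElementTree as ET
from typing import Callable, Optional

# Matches the first <svg>...</svg> span, even if the reply wraps it in prose.
SVG_PATTERN = re.compile(r"<svg\b.*?</svg>", re.DOTALL | re.IGNORECASE)


def extract_svg(response: str) -> Optional[str]:
    """Return the first well-formed SVG document in the response, or None."""
    match = SVG_PATTERN.search(response)
    if match is None:
        return None
    svg = match.group(0)
    try:
        ET.fromstring(svg)  # reject markup that does not parse as XML
    except ET.ParseError:
        return None
    return svg


def draw_in_two_steps(subject: str, generate: Callable[[str], str]) -> Optional[str]:
    """Step 1: ask for a geometric description. Step 2: ask for SVG of that description."""
    description = generate(
        f"Describe a {subject} using only simple geometric shapes. "
        "Give each shape's type, position, and size on a 100 by 100 canvas."
    )
    response = generate(
        "Write a single SVG document with viewBox='0 0 100 100' "
        f"that draws exactly these shapes:\n{description}"
    )
    return extract_svg(response)
```

Whether the second step produces a recognizable drawing is exactly what the proof of concept would need to establish. The sketch only guarantees that the returned string is parseable SVG.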