Illustrated Story Mode - Githubissues

rosscado commented 5 months ago

Use Case: As a parent, I want Pi to tell my child a bedtime story. I want the story to have pictures.

Feature: Use realtime image generation to generate illustrations for a narrative story told by Pi.

A new image could be generated on each message from Pi.

Pi's story text could be passed directly to the image generation service, or more likely, preprocessed by a fast LLM (e.g. Llama3-70B) to generate an image prompt from the narrative text.

Example:

Performance Requirements Image generation should be as close to realtime as possible. However, there is scope for slower generations. It takes about 20-40s to read a message on average, so even a 10s image generation time might be acceptable.

rosscado commented 5 months ago

Image Models https://artificialanalysis.ai/text-to-image

fal-ai/fast-turbo-diffusion

Uses SDXL Turbo/v1.5
Very fast, ~100ms 🚀
Very cheap 🚀
Image quality and prompt following is low 👎
Safety is unknown

rosscado commented 5 months ago

fal-ai/fast-sdxl

Uses SDXL
Fast, ~2.5s 🔥
Cheap
Image quality and prompt following is moderate 😐
Safety is high

rosscado commented 5 months ago

OpenAI DALL.E-3 https://platform.openai.com/docs/guides/images/usage

Use DALL.E-3 SD (for speed and cost vs HD)
Slow, ~8-20s? 🐌
Expensive, 4c per image. Not prohibitive, but would need to be a premium feature. 💳
Image quality and prompt following are excellent 👌🏻
Safety is very high

Pedal-Intelligence / saypi-userscript

Illustrated Story Mode #83