Pedal-Intelligence / saypi-userscript

An independent voice interface for Inflection AI's conversational assistant, Pi
https://www.saypi.ai/

Illustrated Story Mode #83

Open rosscado opened 5 months ago

rosscado commented 5 months ago

Use Case: As a parent, I want Pi to tell my child a bedtime story. I want the story to have pictures.

Feature: Use real-time image generation to illustrate a narrative story told by Pi.

A new image could be generated on each message from Pi.

Pi's story text could be passed directly to the image generation service or, more likely, preprocessed by a fast LLM (e.g. Llama 3 70B) that turns the narrative text into an image prompt.
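A minimal sketch of the two options above (raw story text vs. an LLM-written prompt), for discussion only. The names `illustrate_message`, `write_prompt` and `generate_image` are hypothetical and not part of saypi-userscript; the two callables stand in for whichever LLM and image service end up being used.

```python
from typing import Callable, Optional

# Hypothetical stand-ins: neither callable is an existing saypi-userscript API.
PromptWriter = Callable[[str], str]    # story text -> image prompt (e.g. a fast LLM)
ImageGenerator = Callable[[str], str]  # image prompt -> image URL


def illustrate_message(
    story_text: str,
    generate_image: ImageGenerator,
    write_prompt: Optional[PromptWriter] = None,
) -> str:
    """Return an image URL for one story message from Pi."""
    text = story_text.strip()
    if not text:
        raise ValueError("story_text is empty")
    # With a prompt writer, the LLM rewrites the narrative into an image prompt;
    # without one, the story text is passed to the image service unchanged.
    prompt = write_prompt(text) if write_prompt else text
    return generate_image(prompt)
```

Keeping both services behind plain callables would make it easy to swap image models while comparing speed and quality.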

Example:

[Screenshot, 2024-06-10: fairytale]

Performance Requirements: Image generation should be as close to real time as possible, but slower generation is tolerable. A message takes about 20-40s to read on average, so even a 10s image generation time might be acceptable.
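The latency budget can be made concrete with a little arithmetic. This sketch assumes (my assumption, not stated in the issue) that generation starts when the message starts being read, and computes the share of the reading time during which the picture would be on screen. `visible_fraction` is a hypothetical helper name.

```python
def visible_fraction(read_time_s: float, generation_time_s: float) -> float:
    """Share of the reading time during which the image is already visible.

    Assumes generation starts at the same moment the message starts being read.
    """
    if read_time_s <= 0:
        raise ValueError("read_time_s must be positive")
    # The image appears after generation_time_s; clamp at 0 if it arrives too late.
    return max(0.0, 1.0 - generation_time_s / read_time_s)


# With the figures from the issue: a 10s generation on a 20-40s message.
print(visible_fraction(20, 10))  # 0.5
print(visible_fraction(40, 10))  # 0.75
```

Under that assumption, a 10s generation leaves the illustration visible for half to three quarters of each message.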

rosscado commented 5 months ago

Image Models: https://artificialanalysis.ai/text-to-image

fal-ai/fast-turbo-diffusion

rosscado commented 5 months ago

fal-ai/fast-sdxl


rosscado commented 5 months ago

OpenAI DALL·E 3: https://platform.openai.com/docs/guides/images/usage