This would be super cool, but not something we're gonna focus on immediately. Adding here in case someone wants to take a stab.
It would be great to optionally generate an image to go along with journal entries. Options include Stable Diffusion, MidJourney, and DALLE-2. If tied to DALLE-2, we'll just piggy-back on the premium OpenAI setup. For Stable Diffusion, we'll follow the same system as #160: BYO model (IP/ngrok), and we'll host a first-come, first-served BYO of our own.
The tricky part is the "oddities" around these outputs. I tried running a few of my dreams through DALLE-2, hoping we could easily implement this feature, especially for dream visualization. The output was morbid and grotesque. That's very off-putting, especially for someone who just wants to give it a spin (as opposed to someone who knows what they're up against).
So one solution is to first take an entry, pipe it through the LLM (BYO #160 or GPT) in order to craft a safe and focused Stable Diffusion prompt. We'll need a good prompt for creating the prompt, lol - something like "using stable diffusion's prompt syntax, reconstruct the following journal entry to be visualized as an image. Here is a sample prompt for stable diffusion: ".
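A rough sketch of that entry -> LLM -> image-prompt step, just to make the idea concrete. Everything named here is hypothetical: `SAMPLE_SD_PROMPT` is a placeholder (the actual sample prompt is still to be written), the "avoid morbid or grotesque imagery" wording is my guess at what "safe and focused" should mean, and `llm` is any callable that takes a prompt string and returns text, so it could wrap GPT or a BYO model without this code knowing which.

```python
from typing import Callable

# Hypothetical placeholder; the real sample prompt still needs to be chosen.
SAMPLE_SD_PROMPT = "a quiet forest at dawn, soft light, watercolor, highly detailed"

# Meta-prompt: asks the LLM to turn a journal entry into an image prompt.
# The safety wording is an assumption, not a tested prompt.
META_PROMPT = (
    "Using Stable Diffusion's prompt syntax, reconstruct the following journal "
    "entry to be visualized as an image. Keep the result safe and focused; "
    "avoid morbid or grotesque imagery.\n"
    "Here is a sample prompt for Stable Diffusion: {sample}\n\n"
    "Journal entry:\n{entry}"
)


def build_meta_prompt(entry: str, sample: str = SAMPLE_SD_PROMPT) -> str:
    """Fill the meta-prompt template with the journal entry and sample prompt."""
    return META_PROMPT.format(sample=sample, entry=entry.strip())


def entry_to_image_prompt(entry: str, llm: Callable[[str], str]) -> str:
    """Send the meta-prompt through the given LLM and return the cleaned-up image prompt."""
    return llm(build_meta_prompt(entry)).strip()


def fake_llm(prompt: str) -> str:
    """Stand-in for a real GPT or BYO call so the sketch runs offline."""
    return "  a calm lake under stars, dreamy, pastel colors  "


print(entry_to_image_prompt("I dreamed I was floating on a lake.", fake_llm))
```

Passing the LLM in as a callable keeps the Gnothi-hosted and BYO paths on the same code; only the function handed in changes.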
Anyway, just kicking off the convo here in case others have feedback / ideas. I'm not gonna work on this personally for some time, unless it's a really desired feature.
- [ ] Find a solid one-size-fits-all model. I'm thinking Gnothi uses DALLE-2, BYO leans Stable Diffusion (with a setup guide for getting the IP/URL in via webhook).
- [ ] Create a journal->gpt_prompt->image_prompt pipeline. Either write one, or use something like PromptPerfect.