If possible, please add to the Readme a short 1-2 sentence description for each item under Core Components, Workloads, Recommended Checkpoints, etc.

pmeaney commented 8 months ago

In the Image Diffusion configuration screen, there are several options within different categories. Categories such as:

Core Components
Workloads
Recommended Checkpoints
Upscalers
Control Extensions.

I have never worked with this sort of tool before, so I do not know what each of those 5 categories entails, let alone each item within them (which is the level of specificity I am after).

I would greatly appreciate it if someone with knowledge of them could add 1-2 sentences of what each item within each category does, or how it contributes to image diffusion.

I am interested to know whether certain ones are worth installing for my use case, which is:

Turn a sketch (similar to the 3 Owls example) into a realistic image.

In my case, it's a kiosk for an in-person event. The kiosk contains two mounted TV monitors, with a table between them holding a laptop, iPad, and camera equipment. So, I am interested in knowing which items I need to install. Hence I believe a brief description of each item within the above 5 categories would be very useful.

An example of the knowledge sought: as mentioned above, I intend to "scribble" some "line art", i.e., to sketch an image of an interactive event kiosk and have the pertinent algorithm enhance it from a mere sketch to a photo-realistic image, in a use case very similar to the 3 owls example currently shown in the Readme. Then, in this case, do I need to install the "ControlNet Scribble" and/or "ControlNet Line Art" and/or other items from the 5 main categories?

This project and your knowledge are much appreciated. Thanks!

Acly commented 8 months ago

Then, in this case, do I need to install the "ControlNet Scribble" and/or "ControlNet Line Art" and/or other items from the 5 main categories?

All the options are stand-alone, install what you need. Regarding scribble vs line art: try both!

There is a lot of information out there; searching any of the terms on the internet will give you an idea in most cases. Which is not to say that it wouldn't be very nice to have this written up as a concise overview with some images, but it takes time...

pmeaney commented 8 months ago

Fantastic, thanks Acly! I will delve into some reading on the Krita AI Diffusion plugin and the particular items listed. I only just installed it yesterday, so I am still coming up to speed.

Grant-CP commented 8 months ago

@pmeaney My personal favorite to put in front of people is the scribble controlnet with an ending step of 0.3 or 0.5. The early ending step for controlnets is an option you can enable in the Krita extension.

The general structure of the image is defined early on in the generation process, so it's generally better to let the model take over (no more controlnet) to fill out the details in the last parts of generation. If you are putting it in front of skilled artists who can actually sketch, then keeping the controlnet going the whole way is fine too!
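
To make the early ending step concrete, here is a minimal Python sketch. It assumes the ending step is a fraction of the total sampling steps; the function name and the rounding rule are illustrative assumptions, not the actual code of the Krita extension or ComfyUI.

```python
def controlnet_active_steps(total_steps: int, end_fraction: float) -> range:
    """Return the sampling steps during which the controlnet guides generation.

    Assumption: control is applied from step 0 up to (but not including)
    round(total_steps * end_fraction). Real samplers may round differently.
    """
    if not 0.0 <= end_fraction <= 1.0:
        raise ValueError("end_fraction must be between 0 and 1")
    cutoff = int(round(total_steps * end_fraction))
    return range(0, cutoff)


if __name__ == "__main__":
    # Compare the ending steps mentioned above with full-length control.
    for end in (0.3, 0.5, 1.0):
        active = controlnet_active_steps(20, end)
        print(f"end={end}: controlnet active for {len(active)} of 20 steps")
```

With an ending step of 0.5, only the first half of the steps follow the sketch, and the model fills in details freely for the rest.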

A great place to get a basic idea of controlnets is https://stable-diffusion-art.com/controlnet/ You will want to scroll down to the "Preprocessors and models" tab, since Acly has already done the work of making installation easy. Also, you won't need to choose a preprocessor since Acly has already chosen good ones for the extension.

pmeaney commented 8 months ago

Thanks @Grant-CP

In my initial example, I was attempting a terrible sketch of a kiosk table with bilateral TVs. (I didn't want to focus on a more polished sketch until I could see an end-to-end success example.)

Looks like I ran into a RAM or storage constraint. I think it's because I went with a default of 10 gigs.

Or perhaps it's the issue of input vs output image (need 512x512 I think) as mentioned here: https://www.reddit.com/r/StableDiffusion/comments/x7krjz/comment/indgph6/?utm_source=share&utm_medium=web2x&context=3

I might try this solution next time I try again:

Got it working for me on 16GB M1 Macbook Pro with --n_samples 1 --n_iter 1

Grant-CP commented 8 months ago

I wonder if you are trying to generate a very large image maybe? I think it is best to make sure you start with a 512x512 canvas in Krita (not sure if that’s what you have already).
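
As a rough illustration of picking a canvas size, the Python sketch below scales any canvas to roughly the pixel area of 512x512 while keeping its aspect ratio. The helper name is hypothetical, and snapping to multiples of 8 is an assumption based on how Stable Diffusion works in a downsampled latent space; the extension may handle resolution differently.

```python
import math
from typing import Tuple


def fit_canvas(width: int, height: int, target: int = 512, multiple: int = 8) -> Tuple[int, int]:
    """Scale (width, height) to about target*target pixels, keeping aspect ratio.

    Assumption: each side is snapped to the nearest multiple of `multiple`.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = math.sqrt((target * target) / (width * height))

    def snap(side: int) -> int:
        # Round to the nearest multiple, but never below one multiple.
        return max(multiple, round(side * scale / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    for size in ((1024, 1024), (1920, 1080), (512, 512)):
        print(size, "->", fit_canvas(*size))
```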

Not that this is the issue you are having, but right now I think you are in img2img mode (with 100% denoising strength, meaning your input doesn’t matter), and not using a controlnet (you probably want the scribble controlnet here). You should watch a video online for an overview of the basic options. I know nerdyrodent has one on an earlier version of the Krita extension.

For most errors it’s best to look at comfyui.log, which is in your comfyui folder. That will provide better information for someone to help you with. Honestly I don’t really have a clue what went wrong here.
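
If the log is long, a small Python helper along these lines can pull out the lines worth pasting into an issue. This is only a sketch: the function name and the keyword list are assumptions, and the exact location and format of comfyui.log depend on your installation.

```python
from pathlib import Path
from typing import List


def recent_problems(log_path: str, max_lines: int = 20) -> List[str]:
    """Return the last `max_lines` log lines that look like errors.

    Assumption: problems can be spotted by simple keywords; real logs
    may need a different filter.
    """
    keywords = ("error", "traceback", "exception")
    text = Path(log_path).read_text(encoding="utf-8", errors="replace")
    hits = [line for line in text.splitlines()
            if any(word in line.lower() for word in keywords)]
    return hits[-max_lines:]
```

For example, `recent_problems("comfyui.log")` run from the comfyui folder would list the most recent error-like lines, assuming the file is there.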

pmeaney commented 8 months ago

@Grant-CP Thanks! You were correct about the 512x512 pixel size image.

I also installed ControlNet LineArt & ControlNet Scribble, as an initial test-run. I was able to get results:

Seed image

Attempt 1:

Prompt: two TV monitors flanking a table, with a webcam on top of each TV monitor

Attempt 2:

Prompt: An event kiosk with two TV monitors on stands, with A webcam on top of each TV monitor, and a table in the middle between the TV monitors

Attempt 3:

Prompt: At the location of an indoor industry conference event, the image is a kiosk with two five foot TVs on stands and a table in the middle between the TV monitors

Attempt 4:

Prompt: At the location of an outdoor festival, the image consists of two five foot TVs on stands and a display table in the directly between the TV monitors

Attempt 5:

Prompt: At a city field, the image consists of two five foot TVs with a display table, streaming a stand-up comedy show

Attempt 6:

Prompt: At a city park, during a festival, the image consists of two five foot TVs with a display table, streaming a stand-up comedy show

Grant-CP commented 8 months ago

Great to see you got it started!

First off, I’m not sure that your drawing actually matters at all with your current settings. The “strength” part at the right is called the “denoising strength”. You should look it up, and you will want a value less than 100%. 100% is fine if you have a controlnet enabled, but with 100% and no controlnets basically you are telling the model to ignore 100% of the pixels in your drawn image.
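
To illustrate why 100% strength discards the drawing, here is a minimal Python sketch of a common img2img convention, in which the strength decides how many of the scheduled steps actually run. The function name is hypothetical and this is an assumption about typical Stable Diffusion pipelines, not the extension's own code.

```python
from typing import Tuple


def img2img_plan(total_steps: int, strength: float) -> Tuple[int, int]:
    """Return (skipped_steps, denoising_steps) for a denoising strength.

    Assumption: the input image is noised to the level matching `strength`,
    and only the remaining steps are run. At strength 1.0 nothing is
    skipped, so the input is replaced by pure noise.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    denoising_steps = min(int(total_steps * strength), total_steps)
    skipped_steps = total_steps - denoising_steps
    return skipped_steps, denoising_steps


if __name__ == "__main__":
    for strength in (1.0, 0.75, 0.5):
        skipped, run = img2img_plan(20, strength)
        print(f"strength={strength}: skip {skipped} steps, denoise for {run}")
```

The more steps are skipped, the more of the original drawing's structure survives into the result.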

Also, to use the controlnets (I recommend scribble), you need to click the button to the right of the Strength meter. The options there should be pretty self-explanatory.

I highly recommend watching one or two Stable Diffusion videos on YouTube. They will talk about most of the options that are also available via this Krita extension.

-Grant

pmeaney commented 8 months ago

@Grant-CP Thanks a ton, you are super helpful! I will look into those recommendations and continue exploring the tool!

I am documenting the adventure of learning Krita + the Acly AI Diffusion tool here: https://docusaurus-blog-j24.vercel.app/docs/category/krita-and-ai-powered-dynamic-image-generation

So, I will take note and update my exploration docs.

I see what you mean about the strength of the layers.

Anyway, I'll spend more time learning about the configuration items. I might even write up the 1-2 sentence summaries and provide them here.
