LaurentMazare / diffusers-rs

An implementation of the diffusers api in Rust
Apache License 2.0

Tracking issue for SD ecosystem feature parity #69

Open Keavon opened 1 year ago

Keavon commented 1 year ago

The intention for this issue is to provide a comprehensive outline of all the core features and capabilities that other distributions of Stable Diffusion (primarily A1111) provide. It's a big list, and the items vary widely in priority. Some items in this outline will be turned into GitHub issues for discussing and tracking progress on implementation. Please comment on this issue to suggest additions, clarifications, and sub-features, and I'll aim to keep the outline up to date.

Generation methods

Generation parameters

Model support

Stylization

ControlNet

Some features are described at https://github.com/Mikubill/sd-webui-controlnet but I don't currently have time to make a list of them. Help with such a list would be appreciated.

Optimizations

VRAM reduction strategies, such as xformers and reduced floating-point precision? I don't understand these well enough to list them properly. Other methods remove certain parts of the pipeline from VRAM once that stage has completed, which trades time for lower VRAM requirements. I'll need help creating a list out of this.
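To make the "unload a stage once it has finished" idea concrete, here is a minimal sketch in plain Rust. Everything in it is hypothetical: `Stage`, `load_stage`, `run_sequential`, and `run_resident` are illustration-only names, not part of diffusers-rs or any other library, and a `Vec<f32>` stands in for weights held in VRAM. It only shows why loading stages one at a time lowers peak memory at the cost of reloading work.

```rust
/// Hypothetical pipeline stage; `weights` stands in for memory held on the GPU.
pub struct Stage {
    pub name: &'static str,
    pub weights: Vec<f32>,
}

/// Pretends to load a stage with `size` parameters (a real loader would read a file).
pub fn load_stage(name: &'static str, size: usize) -> Stage {
    Stage { name, weights: vec![0.0; size] }
}

/// Runs the stages one after another, keeping only one loaded at a time.
/// Returns the peak number of parameters resident at once.
pub fn run_sequential(plan: &[(&'static str, usize)]) -> usize {
    let mut peak = 0;
    for &(name, size) in plan {
        let stage = load_stage(name, size); // load just before use (costs time)
        peak = peak.max(stage.weights.len());
        // ... the stage would run here ...
        drop(stage); // free the memory before the next stage is loaded
    }
    peak
}

/// Baseline: load every stage up front and keep all of them resident.
/// Returns the total number of parameters held at once.
pub fn run_resident(plan: &[(&'static str, usize)]) -> usize {
    let stages: Vec<Stage> = plan.iter().map(|&(n, s)| load_stage(n, s)).collect();
    stages.iter().map(|s| s.weights.len()).sum()
}
```

With a made-up plan such as a text encoder, a UNet, and a VAE of different sizes, `run_sequential` peaks at the size of the largest stage, while `run_resident` holds the sum of all three.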

Upscaling

Some upscalers are entirely separate models and are thus likely out of scope. Others, I think, are part of the SD pipeline: some are scripts, but I think others are actual models that need to be implemented in the pipeline itself. Those should probably be included here, but I need help creating a list.

Sampling methods

Other models

Did I miss something? Probably! Hopefully the community can help me keep this list updated so it's as comprehensive as possible. Thanks ❤️.

Ideally these capabilities would be modular, allowing for composability and opting in and out of specific features at will for any desired image generation pipeline. In our use case with Graphite, we want to put different options into nodes within a node graph so they are user-configurable. (I should also mention that keeping the MIT/Apache 2.0 license is important for Graphite, since our project is also Apache 2.0, so I'd humbly request that some care be taken to not copy from copyleft code which would force this library to change its license, thanks 😃).

ninjasaid2k commented 1 year ago

Some features are described at https://github.com/Mikubill/sd-webui-controlnet but I don't currently have time to make a list of them. Help with such a list would be appreciated.

Control Type

Preprocessors

katopz commented 1 year ago

Nice! fyi ControlNet canny is supported here.

LaurentMazare commented 1 year ago

[ ] Viewing the image generation progress as it runs (this is very high priority for Graphite)

@Keavon do you mean that you would want the intermediary images to be available or something else? For the intermediary images, this should already be doable (and available in the command line examples), see for example this snippet.

Keavon commented 1 year ago

Yes, viewing the intermediary images while waiting for the final image to be completed. Good to know that's already supported, thanks! Feel free to check off any others that are in my list and already supported, too. Thank you!
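For readers unfamiliar with the pattern, exposing intermediate images usually comes down to calling back into user code on every sampling step. The following is a minimal sketch of that idea only, not the diffusers-rs API: `Latents` and `sample_with_progress` are hypothetical names, a `Vec<f32>` stands in for a tensor, and the per-step update is a placeholder where a real sampler would run the UNet and scheduler.

```rust
/// Hypothetical stand-in for a latent image; a real pipeline would hold a tensor.
#[derive(Clone, Debug, PartialEq)]
pub struct Latents(pub Vec<f32>);

/// Runs `n_steps` of a toy "denoising" loop and hands each intermediate
/// result to `on_step`, so the caller can decode and display a preview.
pub fn sample_with_progress<F>(mut latents: Latents, n_steps: usize, mut on_step: F) -> Latents
where
    F: FnMut(usize, &Latents),
{
    for step in 0..n_steps {
        // Placeholder update: halve every value. A real sampler would call
        // the UNet and the scheduler here instead.
        for v in latents.0.iter_mut() {
            *v *= 0.5;
        }
        // Expose the intermediate state without giving up ownership.
        on_step(step, &latents);
    }
    latents
}
```

A caller such as a node-graph UI could pass a closure that decodes the latents and refreshes a preview; taking `FnMut` lets the closure accumulate state, for example a list of saved frames.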

LaurentMazare commented 1 year ago

Just to mention that I haven't had the time to do much on all these features. One thing I have been working on, though, is a new ML framework written in Rust called candle. As an example, it includes Stable Diffusion 1.5 and 2.1, but only text2img at the moment. If there is some interest, I can add more there and make a full crate out of this example. The main upside compared to this crate is that there is no dependency on libtorch anymore, so deployment is a lot faster, it could run on WASM, etc. The main downside is that it might not be as optimized as the libtorch version yet, but we're working on it.

Keavon commented 1 year ago

I was looking at both Candle and Burn (to which @Gadersd has recently ported both SD 1.4 and SDXL), and it definitely looks like one of those frameworks is the path forward in the Rust ecosystem (although I'm curious what their differences are).

I'd really love to help organize interested contributors into a team building a robust, production-ready, pure-Rust Stable Diffusion distro that aims to be as fully featured as AUTOMATIC1111. I wonder if you have any thoughts or suggestions about that, @LaurentMazare. Likewise, @Gadersd was interested in the idea, but I should reach out again about next steps.