jina-ai / dalle-flow

🌊 A Human-in-the-Loop workflow for creating HD images from text

grpcs://dalle-flow.dev.jina.ai

Weighted subprompts, 25% SD performance boost, less VRAM usage, and more #112

Closed AmericanPresidentJimmyCarter closed 2 years ago

AmericanPresidentJimmyCarter commented 2 years ago

Adds positively or negatively weighted subprompts (Closes #103). You can test these with, for example, "tabby cat: 0.75, tiger: 0.25".
Uses a new stable-diffusion repo with much better memory management and performance updates (Closes #110)
The latent representation and conditioned embeddings are now stored, along with the parameters for the API call (Closes #104)
Adds the ability to use the SD concepts library by simply supplying the name of the concept in brackets (Closes #111). You can test this with a prompt like "".

Weighted prompts/subprompts and SD concepts are only available in the main stable branch. No plans currently to support them in stablelite, which is considered a barebones implementation.
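To make the "text: weight" prompt format concrete, here is a minimal sketch of how such a string could be split into (subprompt, weight) pairs. The function name parse_weighted_subprompts, the comma separator, and the default weight of 1.0 are assumptions for illustration; the grammar this PR actually accepts is not shown in the thread.

```python
def parse_weighted_subprompts(prompt, default_weight=1.0):
    """Split 'text: weight, text: weight' into (text, weight) pairs.

    Hypothetical helper: not the parser used by the executor.
    """
    pairs = []
    for chunk in prompt.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Split on the last colon so the text itself may contain colons.
        text, sep, weight = chunk.rpartition(":")
        if sep:
            try:
                pairs.append((text.strip(), float(weight)))
                continue
            except ValueError:
                pass  # the part after the colon was not a number
        # No usable weight found: keep the whole chunk with the default.
        pairs.append((chunk, default_weight))
    return pairs


weighted = parse_weighted_subprompts("tabby cat: 0.75, tiger: 0.25")
negative = parse_weighted_subprompts("forest: 1, fog: -0.5")
```

A negative weight, as in the second call, would push the result away from that subprompt once the weights are applied to the conditioning.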

venetanji commented 2 years ago

@AmericanPresidentJimmyCarter your fork works great, thanks for doing this. I couldn't increase the batch size (n_samples > 1 in executors/stable/config.yml) though; I'm getting errors about the tensor sizes not matching.

/dalle/dalle-flow/executors/stable/executor.py:1… in forward

    100     '''
    101     uncond_count = uncond.size(dim=0)
    102     cond_count = cond.size(dim=0)
  ❱ 103     cond_in = torch.cat((uncond, cond))
    104     del uncond, cond
    105     cond_arities_tensor = torch.tensor(
    106     if use_half and (x.dtype == torch.f

RuntimeError: Sizes of tensors must match except in dimension 0. Expected size 77 but got size 154 for tensor number 1 in the list.
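The message reflects the shape rule for concatenating along dimension 0: every other dimension has to agree across the inputs. A minimal pure-Python sketch of that check, with no PyTorch dependency, is below. The helper name cat_dim0_shape, the batch sizes, and the embedding width of 768 are assumptions for illustration; only the sizes 77 and 154 come from the traceback.

```python
def cat_dim0_shape(shapes):
    """Return the shape produced by concatenating along dimension 0.

    Hypothetical helper that mirrors the rule in the error message:
    all dimensions except 0 must match across inputs.
    """
    first = tuple(shapes[0])
    total = 0
    for index, shape in enumerate(shapes):
        shape = tuple(shape)
        if len(shape) != len(first):
            raise ValueError(
                f"tensor number {index} has {len(shape)} dimensions, "
                f"expected {len(first)}"
            )
        for dim in range(1, len(first)):
            if shape[dim] != first[dim]:
                raise ValueError(
                    f"dimension {dim}: expected size {first[dim]} "
                    f"but got size {shape[dim]} for tensor number {index}"
                )
        total += shape[0]
    return (total,) + first[1:]


# Assumed shapes: (batch, tokens, embedding width).
uncond_shape = (2, 77, 768)
cond_shape = (1, 154, 768)  # token axis doubled instead of the batch axis

try:
    cat_dim0_shape([uncond_shape, cond_shape])
    message = None
except ValueError as err:
    message = str(err)
```

With matching token axes, for example two inputs of shape (2, 77, 768), the same helper returns (4, 77, 768).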

AmericanPresidentJimmyCarter commented 2 years ago

I will look into it. This was originally a bug that was fixed with a force push, but if you're on the latest commit and it is still happening, there must be a regression.

venetanji commented 2 years ago

I got it working by passing the batch size to split_weighted_subprompts_and_return_cond_latents and changing the dimensions of the tile operation to (batch_size, 1, 1). Not entirely sure if I'm doing it right, but it works fine now.

AmericanPresidentJimmyCarter commented 2 years ago

The size of dimension 0 of the unconditioned prompt (unconditioned_prompt.size()[0]) should equal the batch size unless something has gone wrong. I will look into it.

venetanji commented 2 years ago

I think tile is just missing a dimension; it should be (unconditioned_prompt.size()[0], 1, 1).
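torch.tile documents that when the repetition tuple has fewer entries than the tensor has dimensions, ones are prepended to it. A pure-Python sketch of that shape arithmetic shows why a tuple that is one entry short would repeat along the token axis instead of the batch axis. The helper tiled_shape, the width of 768, and the two-element tuple are assumptions for illustration; the thread does not show the original tile call.

```python
def tiled_shape(shape, reps):
    """Compute the result shape of tiling `shape` by `reps`.

    Hypothetical helper following the documented rule: whichever of the
    two tuples is shorter gets padded with leading ones.
    """
    shape = tuple(shape)
    reps = tuple(reps)
    if len(reps) < len(shape):
        reps = (1,) * (len(shape) - len(reps)) + reps
    elif len(shape) < len(reps):
        shape = (1,) * (len(reps) - len(shape)) + shape
    return tuple(size * rep for size, rep in zip(shape, reps))


batch_size = 2
single_prompt = (1, 77, 768)  # assumed (batch, tokens, embedding width)

# One entry short: (2, 1) is padded to (1, 2, 1), doubling the token axis.
too_short = tiled_shape(single_prompt, (batch_size, 1))

# Full-length tuple: repeats along the batch axis only.
fixed = tiled_shape(single_prompt, (batch_size, 1, 1))
```

Here too_short comes out as (1, 154, 768), which lines up with the 154 in the reported error, while fixed is (2, 77, 768), whose dimension 0 equals the batch size.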

AmericanPresidentJimmyCarter commented 2 years ago

@venetanji You were correct, the bug has been fixed.

@delgermurun The stable_diffusion module has been properly made into a stable_inference module and the code for the executor has been dramatically simplified and made easier to read.

AmericanPresidentJimmyCarter commented 2 years ago

@delgermurun As discussed on slack, the lite executor has been removed and the README updated.

AmericanPresidentJimmyCarter commented 2 years ago

#105 is done now too.

AmericanPresidentJimmyCarter commented 2 years ago

@samsja @delgermurun Should be good to go.

samsja commented 2 years ago

Great work, @AmericanPresidentJimmyCarter, we really appreciate your effort.