Flux starts adding horizontal stripes to images around 2K resolution

LilithDragoness commented 2 months ago

It might be an upstream issue.

I'm using Forge with default settings, except for the resolution. However, I’ve tried most of the samplers and schedulers to fix the problem, but without success. What's particularly frustrating is that the issue randomly disappears once in a while for reasons unknown to me.

Whether I create a 2K image in one step or upscale one from 1K, I often—but not always—get horizontal stripes. I initially suspected it was caused by a LoRA, and while it sometimes seems worse when I use LoRAs, the issue also happens with vanilla Flux. Using FP16 doesn’t help either; in fact, it might even make things worse.

It’s not just my images either. I often see horizontal stripes in Flux-generated images online, although they are usually subtle, as most people don’t post upscaled Flux images in the places I visit.

Occasionally, I manage to create almost stripe-free images at 2K, and sometimes I get no stripes at all with the same image and settings—maybe 1 out of 10 times—but I have no idea why.

I have never had such an issue with any SDXL model.

Please refer to the attached screenshots.

00017-3769130604 00067-3209157356 00010-3588118394

Juqowel commented 2 months ago

Can you provide the prompt and settings example?

LilithDragoness commented 2 months ago

Sure. In the second image, it's quite obvious. As for the third one, you can see it in front of her mouth, on the fire. This time, the artifacts appear more like squares than stripes.

The issue is that my 1K to 2K upscales sometimes end up looking like the second image. It’s a bit tricky to reproduce this without using LoRAs, because I believe it depends on the colors or the color balance of the image, which affects how subtle or strong the issue becomes. Many LoRAs, including those I’ve trained, tend to generate more homogeneous areas, like shiny skin, which might make the problem more noticeable

For example, it's particularly noticeable and distracting in the breast area on her: https://www.deviantart.com/lilithdragoness/art/Reptilian-Dragoness-Goddess-1094968131

Here, it's on her shoulder and face: https://www.deviantart.com/lilithdragoness/art/Succubus-1094998180

--

Version: f2.0.1v1.10.1-previous-506-gd1e40361 Commit hash: d1e403619dd6a9570052c1e99869fd108f071a72

Model: flux1-dev-fp8.safetensors

Prompt: The eternal goddess of darkness and fire closeup, goddess portrait, dark feminine.

Seed: 3778897589

1: Resolution: 1024 x 1024

"The eternal goddess of darkness and fire closeup, goddess portrait, dark feminine. Steps: 20, Sampler: Euler, Schedule type: Simple, CFG scale: 1, Distilled CFG Scale: 3.5, Seed: 3778897589, Size: 1024x1024, Model hash: 275ef623d3, Model: flux1-dev-fp8, Version: f2.0.1v1.10.1-previous-506-gd1e40361

Time taken: 1 min. 16.1 sec."

2: Resolution: 2048 x 2048

"The eternal goddess of darkness and fire closeup, goddess portrait, dark feminine. Steps: 20, Sampler: Euler, Schedule type: Simple, CFG scale: 1, Distilled CFG Scale: 3.5, Seed: 3778897589, Size: 2048x2048, Model hash: 275ef623d3, Model: flux1-dev-fp8, Version: f2.0.1v1.10.1-previous-506-gd1e40361

Time taken: 1 min. 28.3 sec."

3: Img2img, Resize by 2, same prompt, same seed

"The eternal goddess of darkness and fire closeup, goddess portrait, dark feminine. Steps: 20, Sampler: Euler, Schedule type: Simple, CFG scale: 1, Distilled CFG Scale: 3.5, Seed: 3778897589, Size: 2048x2048, Model hash: 275ef623d3, Model: flux1-dev-fp8, Denoising strength: 0.75, Version: f2.0.1v1.10.1-previous-506-gd1e40361

Time taken: 1 min. 11.5 sec."

00015-3778897589 00016-3778897589 00005-3778897589

Python 3.10.14 (main, May  6 2024, 19:42:50) [GCC 11.2.0]
Version: f2.0.1v1.10.1-previous-506-gd1e40361
Commit hash: d1e403619dd6a9570052c1e99869fd108f071a72
Legacy Preprocessor init warning: Unable to install insightface automatically. Please try run `pip install insightface` manually.
Launching Web UI with arguments: --share
Total VRAM 48677 MB, total RAM 45140 MB
pytorch version: 2.3.1+cu118
xformers version: 0.0.27+cu118
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA RTX A6000 : native
Hint: your device supports --cuda-malloc for potential speed improvements.
VAE dtype preferences: [torch.bfloat16, torch.float32] -> torch.bfloat16
CUDA Using Stream: False
Using xformers cross attention
Using xformers attention for VAE
ControlNet preprocessor location: /notebooks/bin/forge/models/ControlNetPreprocessor
2024-09-06 06:03:13,418 - ControlNet - INFO - ControlNet UI callback registered.
Model selected: {'checkpoint_info': {'filename': '/notebooks/bin/forge/models/Stable-diffusion/flux1-dev-fp8.safetensors', 'hash': 'be9881f4'}, 'additional_modules': [], 'unet_storage_dtype': None}
Using online LoRAs in FP16: False
Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://xxx.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
Startup time: 32.2s (prepare environment: 3.6s, import torch: 18.9s, initialize shared: 0.4s, other imports: 1.7s, list SD models: 0.2s, load scripts: 3.5s, create ui: 2.0s, gradio launch: 1.8s).
Environment vars changed: {'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False}
------------------
[Low VRAM Warning] You just set Forge to use 100% GPU memory (47652.00 MB) to load model weights.
[Low VRAM Warning] This means you will have 0% GPU memory (0.00 MB) to do matrix computation. Computations may fallback to CPU or go Out of Memory.
[Low VRAM Warning] In many cases, image generation will be 10x slower.
[Low VRAM Warning] To solve the problem, you can set the 'GPU Weights' (on the top of page) to a lower value.
[Low VRAM Warning] If you cannot find 'GPU Weights', you can click the 'all' option in the 'UI' area on the left-top corner of the webpage.
[Low VRAM Warning] Make sure that you know what you are testing.
------------------
Loading Model: {'checkpoint_info': {'filename': '/notebooks/bin/forge/models/Stable-diffusion/flux1-dev-fp8.safetensors', 'hash': 'be9881f4'}, 'additional_modules': [], 'unet_storage_dtype': None}
[Unload] Trying to free all memory for cuda:0 with 0 models keep loaded ... Done.
StateDict Keys: {'transformer': 780, 'vae': 244, 'text_encoder': 198, 'text_encoder_2': 220, 'ignore': 0}
Using Detected T5 Data Type: torch.float8_e4m3fn
Using Detected UNet Type: torch.float8_e4m3fn
Working with z of shape (1, 16, 32, 32) = 16384 dimensions.
K-Model Created: {'storage_dtype': torch.float8_e4m3fn, 'computation_dtype': torch.bfloat16}
Model loaded in 43.2s (unload existing model: 0.3s, forge model load: 42.9s).
Skipping unconditional conditioning when CFG = 1. Negative Prompts are ignored.
[Unload] Trying to free 7725.00 MB for cuda:0 with 0 models keep loaded ... Done.
[Memory Management] Target: JointTextEncoder, Free GPU: 36235.00 MB, Model Require: 5154.62 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: 30056.38 MB, All loaded to GPU.
Moving model(s) has taken 7.49 seconds
Distilled CFG Scale: 3.5
[Unload] Trying to free 4999.49 MB for cuda:0 with 0 models keep loaded ... Current free memory is 30606.41 MB ... Done.
[Memory Management] Target: KModel, Free GPU: 30606.41 MB, Model Require: 0.00 MB, Previously Loaded: 11350.07 MB, Inference Require: 1024.00 MB, Remaining: 29582.41 MB, All loaded to GPU.
Moving model(s) has taken 0.01 seconds
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:16<00:00,  1.18it/s]
[Unload] Trying to free 4563.84 MB for cuda:0 with 0 models keep loaded ... Current free memory is 30588.25 MB ... Done.█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:16<00:00,  1.16it/s]
[Memory Management] Target: IntegratedAutoencoderKL, Free GPU: 30588.25 MB, Model Require: 159.87 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: 29404.38 MB, All loaded to GPU.
Moving model(s) has taken 0.37 seconds
Total progress: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:21<00:00,  1.06s/it]
Skipping unconditional conditioning when CFG = 1. Negative Prompts are ignored.██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:21<00:00,  1.16it/s]
[Unload] Trying to free 1024.00 MB for cuda:0 with 1 models keep loaded ... Current free memory is 30134.87 MB ... Done.
Distilled CFG Scale: 3.5
[Unload] Trying to free 5242.88 MB for cuda:0 with 1 models keep loaded ... Current free memory is 30126.87 MB ... Done.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [01:23<00:00,  4.20s/it]
[Unload] Trying to free 17424.00 MB for cuda:0 with 1 models keep loaded ... Current free memory is 30118.37 MB ... Done.████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [01:21<00:00,  4.33s/it]
Total progress: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [01:25<00:00,  4.28s/it]
[Unload] Trying to free 14136.00 MB for cuda:0 with 1 models keep loaded ... Current free memory is 30026.37 MB ... Done.████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [01:25<00:00,  4.33s/it]
Skipping unconditional conditioning when CFG = 1. Negative Prompts are ignored.
[Unload] Trying to free 5242.88 MB for cuda:0 with 1 models keep loaded ... Current free memory is 30115.87 MB ... Done.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [01:06<00:00,  4.16s/it]
[Unload] Trying to free 17424.00 MB for cuda:0 with 1 models keep loaded ... Current free memory is 30103.87 MB ... Done.████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [01:04<00:00,  4.31s/it]
Total progress: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [01:08<00:00,  4.27s/it]
Total progress: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [01:08<00:00,  4.31s/it]

Juqowel commented 2 months ago

Well. I guess the problem is 2k resolution. Flux work resolution - from 0.1 mp to 2.0 mp which is not equal to 2k (2048 x 2048 ~ 4.0mp). It looks like the model is simply not designed to work correctly with higher resolution, including img2img. You can ask about this issue here. They should know better.

LilithDragoness commented 2 months ago

Flux adds stripes or squares to 1K images as well, but they’re much less noticeable than at higher resolutions. I usually don’t spot them unless I’m specifically looking for them. However, when the image is being rendered, you can clearly see the stripes in the preview. I never see this with SDXL. I wouldn’t mind if they disappeared in the final result, but unfortunately, they remain.

I see digital artists moving toward 4K resolution, and I do the same with SDXL. I hope Flux receives an update soon to address this issue.

Thank you for checking it.

Juqowel commented 2 months ago

Flux adds stripes or squares to 1K images as well

Squares. You also can see it at previews (Settings - Live preview display period). This is related to how flux works.

But its relatively rare thing at small resolution.

LilithDragoness commented 2 months ago

What’s frustrating is that the presence of stripes doesn’t just depend on the image itself—on rare occasions, I’ve been able to upscale problematic images to 2K with almost no artifacts, but only after several attempts, and I have no idea what changed.

I can’t consistently replicate the same results, even with the same image and prompt. It feels like a solution is just out of reach—maybe something as simple as tweaking the prompt, rearranging the order of the LoRAs, or making small adjustments, and then trying again with the original prompt—but I can’t seem to figure it out.

If it never worked at all, I could move on, but the inconsistency makes it harder to let go.

camoody1 commented 2 months ago

I have a very similar issue when upscaling Flux images in ComfyUI. However, my stripes are vertical. They really ruin an otherwise great looking image. You can clearly see the lines in the sky on the right.

BaseMe2 commented 2 months ago

We could mitigate this if we had multi diffusion. But going by lllyas it's not going to happen. https://github.com/lllyasviel/stable-diffusion-webui-forge/issues/159#issuecomment-1936855061 However, I see this as necessity with Flux, as a lot of people won't be able to upscale with the controlnet tiled upscale workflow, requiring additional gigabytes of VRAM.

Dravoss commented 2 months ago

i had the horizontal bars in some upscaled pictures before too but its very inconsistent to replicate.

Juqowel commented 1 month ago

I did some tests and found only one relatively good solution.

img2img your "flux result" with SDXL(any relevant) and these settings:

sttngs

You can experiment with it.

djasil commented 1 month ago

This was reported before #1633.

Generating a black background in 1920x1080 (no upscale) is enough to reproduce the issue: 00010-4204815242 "completely black background" Steps: 20, Sampler: [Forge] Flux Realistic, Schedule type: Simple, CFG scale: 1, Distilled CFG Scale: 3.5, Seed: 4204815242, Size: 1920x1080, Model hash: 6d44448945, Model: flux1-dev-Q8_0

Juqowel commented 1 month ago

I have found that in some cases any resolution can cause this. Example (896x1152, all standart settings):

00015-2106305554

LilithDragoness commented 1 month ago

I believe the issue worsens when creating a Flux LoRA using Flux-generated images, which is a common approach for achieving a consistent character, for example.

vmetternich commented 1 month ago

Get the same issues in comfyUI too.

jonan10075 commented 1 month ago

I have the same problem and it is quite a serious problem since images like this may not be of any use.

Arnaud3013 commented 1 month ago

It's a limit of Flux. It can't go upper 2M pixel. You still can use upscaling on your 1024x1024 pic (else you can still generate at 1408*1408 almost max available and then upscale a bit)

zer0int commented 1 month ago

I was able to reproduce this even for 1024x1024 images by shuffling components (MLP, Attn) of either Flux.1-Dev, but also of the Text Encoders. Especially only shuffling T5 MLP over a long range of Layers seems to be sufficient to reliably reproduce the "stripes". I presented the issue as well as my findings to GPT-4o together with technical info (e.g. guidance-distilled rectified flow transformer, etc.). GPT-4o pointed to positional embeddings (without me ever mentioning positional embeddings in the prompt, so that's a good sign), here: RoPE (Rotary Positional Embeddings).

Proposed solutions (now using GPT-4o + o1) included delusional ones (involving retraining of entire model with all weights requiring gradient), but also some about the way interpolation is handled. There are methods such as the interpolation used in Long-CLIP 248 Tokens CLIP-L, but those are "classic" positional embeddings. Not RoPE. I don't have the kind of ML background to make an educated guess here, I have to rely on the stochastics of GPT-4o's proposals. Maybe somebody here has an idea, so I am sharing this for brainstorming (I don't have a solution yet!).

Also, keep in mind this always applies (is an issue with the model itself, not Forge or ComfyUI or whatever). The way text encoder input is handled differs, but I was also able to reproduce this via the command line / using diffusers pipe.

positional-embeddings-result

Note that semantic context is of course lost (this is expected), as transformers are hierarchical, so if you mess up the order in which they process things, well... You mess them up. In overly simplified terms, you're asking AI to draw the rest of the owl and then draw two circles, which does not compute. But yeah, the side-effect of being able to pretty reliably produce the striping artifacts by messing up T5 text encoder embeddings in such a way is certainly 'curious'.

how-transformers-work

PPS: I have ComfyUI nodes and a CLI script to reproduce this. It's pretty rough code and the ComfyUI nodes are not at all memory efficient.

lllyasviel / stable-diffusion-webui-forge

Flux starts adding horizontal stripes to images around 2K resolution #1712