Validation image generation currently produces only random noise (see attached example image) instead of properly decoded images. This appears to be a systematic failure in the VAE decoding pipeline, particularly in its bfloat16 handling.
Current Behavior
Validation images show random noise/static pattern
No visible image structure or content
Consistent gray background with small white/black dots
Pattern suggests early pipeline failure rather than just dtype mismatch
Current Implementation
def prepare_image(img):
    with torch.cuda.amp.autocast():
        # Latents have 4 channels; decode them with the VAE after undoing the latent scaling.
        if img.shape[1] == 4:
            img = self.default_vae.decode(img / 0.18215).sample
        img = img.float()
    return img
Root Cause Analysis
VAE Initialization:
Potential Issues:
Proposed Fix
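One possible direction, given as a minimal sketch under assumptions rather than as the project's actual change: cast the latents to the VAE's own parameter dtype and decode with autocast disabled, so that no implicit bfloat16 conversion happens during decoding. The helper name `decode_latents` is hypothetical. The 4-channel check, the `0.18215` scaling factor, and the `.decode(...).sample` call mirror the current implementation. Whether this removes the noise has not been verified here.

```python
import torch


def decode_latents(vae, latents, scaling_factor=0.18215):
    """Decode 4-channel latents with an explicit dtype (hypothetical helper).

    Assumes `vae` is a torch.nn.Module whose decode() returns an object
    with a `.sample` tensor, as in the current implementation.
    """
    # Tensors that are not 4-channel latents are passed through as float32.
    if latents.shape[1] != 4:
        return latents.float()

    # Match the dtype of the VAE weights instead of relying on autocast.
    vae_dtype = next(vae.parameters()).dtype

    with torch.no_grad(), torch.autocast(
        device_type=latents.device.type, enabled=False
    ):
        image = vae.decode(latents.to(vae_dtype) / scaling_factor).sample

    # Return float32 so downstream image conversion does not see bfloat16.
    return image.float()
```

With the names from the current implementation, `prepare_image` could then return `decode_latents(self.default_vae, img)`.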
Validation Steps
Add tensor validation:
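A tensor validation step could look roughly like the sketch below. The function name `validate_tensor` and the statistics it returns are illustrative assumptions, not part of the existing code. It fails on NaN or Inf values and otherwise reports basic statistics, which makes it easy to see whether latents or decoded images have an implausible range.

```python
import torch


def validate_tensor(name, t):
    """Raise on empty or non-finite tensors; otherwise return basic statistics."""
    if not torch.is_tensor(t):
        raise TypeError(f"{name}: expected a torch.Tensor, got {type(t).__name__}")
    if t.numel() == 0:
        raise ValueError(f"{name}: tensor is empty")

    # Upcast so the statistics are not limited by bfloat16 precision.
    t32 = t.detach().float()

    if not bool(torch.isfinite(t32).all()):
        n_nan = int(torch.isnan(t32).sum())
        n_inf = int(torch.isinf(t32).sum())
        raise ValueError(f"{name}: found {n_nan} NaN and {n_inf} Inf values")

    return {
        "name": name,
        "shape": tuple(t.shape),
        "dtype": str(t.dtype),
        "min": t32.min().item(),
        "max": t32.max().item(),
        "mean": t32.mean().item(),
    }
```

Calling it on the latents before decoding and on the decoded image afterwards shows which side of the VAE first produces invalid values.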
Add checkpoints in decode pipeline:
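Checkpoints in the decode path might be sketched as follows. The names `tensor_summary` and `decode_with_checkpoints`, the stage labels, and the `log` callback are hypothetical. The scaling factor and the `.decode(...).sample` call are taken from the current implementation. Each stage records shape, dtype, and value range, so the first stage with a suspicious entry points at where the pipeline breaks.

```python
import torch


def tensor_summary(t):
    """Collect shape, dtype, and value range of a tensor (computed in float32)."""
    t32 = t.detach().float()
    return {
        "shape": tuple(t.shape),
        "dtype": str(t.dtype),
        "min": t32.min().item(),
        "max": t32.max().item(),
        "mean": t32.mean().item(),
        "finite": bool(torch.isfinite(t32).all()),
    }


def decode_with_checkpoints(vae, latents, scaling_factor=0.18215, log=print):
    """Decode latents and record a summary after every stage."""
    checkpoints = []

    def checkpoint(stage, t):
        entry = {"stage": stage, **tensor_summary(t)}
        checkpoints.append(entry)
        log(entry)
        return t

    latents = checkpoint("latents_in", latents)
    scaled = checkpoint("latents_scaled", latents / scaling_factor)
    with torch.no_grad():
        decoded = checkpoint("decoded", vae.decode(scaled).sample)
    image = checkpoint("image_float32", decoded.float())
    return image, checkpoints
```

Passing `log=lambda entry: None` silences the output while still returning the collected checkpoints for later inspection.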
Testing Plan
Impact