XLabs-AI / x-flux

Apache License 2.0
1.65k stars 118 forks

Error when a PNG has 4 channels #49

Closed xqiprogramming closed 3 months ago

xqiprogramming commented 3 months ago

[rank0]: Traceback (most recent call last):
[rank0]:   File "/mnt/workspace/wuyu_workspace/download/x-flux/train_flux_lora_deepspeed.py", line 301, in
[rank0]:   File "/mnt/workspace/wuyu_workspace/download/x-flux/train_flux_lora_deepspeed.py", line 218, in main
[rank0]:     x_1 = vae.encode(img.to(accelerator.device).to(torch.float32))
[rank0]:   File "/mnt/workspace/wuyu_workspace/download/x-flux/src/flux/modules/autoencoder.py", line 303, in encode
[rank0]:     z = self.reg(self.encoder(x))
[rank0]:   File "/mnt/data/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:   File "/mnt/data/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:   File "/mnt/workspace/wuyu_workspace/download/x-flux/src/flux/modules/autoencoder.py", line 161, in forward
[rank0]:     hs = self.conv_in(x)
[rank0]:   File "/mnt/data/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:   File "/mnt/data/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:   File "/mnt/data/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 460, in forward
[rank0]:     return self._conv_forward(input, self.weight, self.bias)
[rank0]:   File "/mnt/data/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
[rank0]:     return F.conv2d(input, weight, bias, self.stride,
[rank0]: RuntimeError: Given groups=1, weight of size [128, 3, 3, 3], expected input[1, 4, 1024, 1024] to have 3 channels, but got 4 channels instead

arrowonstr commented 3 months ago

I think there are transparent pictures (.png) in your dataset. They have 4 channels (RGBA). You may need to preprocess your dataset; for example, image.convert('RGB') will work.
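For anyone landing here with the same error, a minimal sketch of that preprocessing step, assuming the images are loaded with Pillow. The helper name `to_rgb` and the white background are illustrative choices, not part of x-flux. A plain `image.convert('RGB')` simply drops the alpha channel, so fully transparent pixels keep whatever color is stored underneath them; compositing onto a solid background first avoids that.

```python
from PIL import Image


def to_rgb(image, background=(255, 255, 255)):
    """Return a 3-channel RGB version of a Pillow image.

    `to_rgb` and the default white background are hypothetical choices
    for illustration; they are not taken from the x-flux code base.
    """
    if image.mode == "RGB":
        return image
    # Images with an alpha channel (or palette transparency) are flattened
    # onto a solid background instead of just discarding the alpha values.
    if image.mode in ("RGBA", "LA") or "transparency" in image.info:
        rgba = image.convert("RGBA")
        canvas = Image.new("RGB", rgba.size, background)
        canvas.paste(rgba, mask=rgba.getchannel("A"))
        return canvas
    # Grayscale, CMYK, etc. have no alpha, so a plain conversion is enough.
    return image.convert("RGB")
```

Calling something like this right after opening each training image, before it is turned into a tensor, should give the VAE encoder the `[N, 3, H, W]` input it expects.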

Anghellia commented 3 months ago

Hi! Yeah, it seems your input images are in a different format, so they don't match the model architecture. Check this line; the img should have 3 channels.
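To find which files are responsible before starting a training run, one option is to scan the dataset folder and list every image whose mode is not RGB. This is only a sketch, assuming Pillow and a folder of ordinary image files; `find_non_rgb_images` and the extension list are hypothetical and not part of the repository.

```python
from pathlib import Path

from PIL import Image


def find_non_rgb_images(folder, extensions=(".png", ".jpg", ".jpeg", ".webp")):
    """List (path, mode) pairs for images under `folder` that are not RGB.

    Hypothetical helper for illustration; the extension list is an assumption.
    """
    offenders = []
    for path in sorted(Path(folder).rglob("*")):
        if path.suffix.lower() not in extensions:
            continue
        # Opening an image only reads its header, so this stays cheap
        # even for large datasets.
        with Image.open(path) as img:
            if img.mode != "RGB":
                offenders.append((path, img.mode))
    return offenders
```

Any path reported with mode `RGBA` would produce the 4-channel tensor seen in the traceback above.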

xqiprogramming commented 3 months ago

I think there are transparent pictures (.png) in your dataset. They have 4 channels (RGBA). You may need to preprocess your dataset; for example, image.convert('RGB') will work.

Thanks, that fixed it.