PeterL1n / BackgroundMattingV2

Real-Time High-Resolution Background Matting
MIT License
6.81k stars, 950 forks

"L" mode returns a grayscale object instead of transparency (alpha) channel #114

Closed by Anuj040 3 years ago

Anuj040 commented 3 years ago

Thanks a lot for the amazing work. While trying to replicate it, I noticed that the object returned by the following line of code is not an alpha matte: "L" mode converts the image to grayscale rather than extracting the transparency channel. This should have serious implications for model training. Please let me know if I am approaching this from the wrong angle or if it is a mistake.

https://github.com/PeterL1n/BackgroundMattingV2/blob/eca0f27fea2436b38c8ce61d64114ad49076fa49/train_refine.py#L90

To return the alpha matte, I implemented the following instead, and it works fine:

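    # Load the image as RGBA and keep only the alpha channel.
    with Image.open(self.filenames[idx]) as img:
        img = img.convert("RGBA")
        img = img.split()[-1]  # the last band of an RGBA image is alpha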
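
To make the difference concrete, here is a minimal sketch (not taken from the repository) that compares the two approaches on a single RGBA pixel. The 1x1 test image and the variable names are assumptions chosen for illustration. It only applies when the input file really carries an alpha channel.

```python
from PIL import Image

# Hypothetical 1x1 RGBA image: pure red at roughly 50% opacity.
rgba = Image.new("RGBA", (1, 1), (255, 0, 0, 128))

# "L" mode computes luminance from the RGB channels and discards alpha.
gray_value = rgba.convert("L").getpixel((0, 0))

# The last band of an RGBA image is the alpha channel.
alpha_value = rgba.split()[-1].getpixel((0, 0))

print(gray_value, alpha_value)
```

For an RGBA input the two values differ, which is the behavior described above. For an image that is already a single-channel alpha matte, `convert("L")` leaves the values unchanged.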

PeterL1n commented 3 years ago

For our training, we don't provide RGBA PNG images. Instead, our images are split into RGB foreground images and alpha images, stored under the fgr and pha directories. When loading, the alpha image is converted to a single channel using "L" mode. If you have a custom dataset, you need to change the dataloader code. "L" mode was intended and is not a bug.
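
For a custom dataset, one possible adaptation is a loader that accepts both layouts. The sketch below is an assumption, not the repository's dataloader: `load_alpha` and `_png_bytes` are hypothetical helpers, and in-memory PNGs stand in for files on disk. It takes the alpha band when the image has one and otherwise treats the grayscale image itself as the matte.

```python
import io
from PIL import Image

def load_alpha(source):
    """Return a single-channel ("L") alpha matte from a path or file object.

    Hypothetical helper, not part of the repository.
    """
    with Image.open(source) as img:
        if "A" in img.getbands():
            # RGBA or LA input: take the transparency channel itself.
            return img.getchannel("A")
        # Separate alpha image (pha-style layout): the grayscale values are the matte.
        return img.convert("L")

def _png_bytes(img):
    """Encode an image as an in-memory PNG (stand-in for a file on disk)."""
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    buf.seek(0)
    return buf

# Alpha stored as its own grayscale image.
separate_pha = load_alpha(_png_bytes(Image.new("L", (2, 2), 200)))
# Alpha stored inside an RGBA image.
combined_rgba = load_alpha(_png_bytes(Image.new("RGBA", (2, 2), (255, 0, 0, 128))))
```

Both calls return an "L" mode image, so the rest of a training pipeline could stay unchanged. Palette PNGs with transparency are not handled in this sketch.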

Anuj040 commented 3 years ago

Understood. Thanks a lot.