Hi, thanks for your attention.
Can you share some failure cases with me? If the foreground color is very similar to the background, blurred boundaries are expected. As for the image in the GitHub README, we converted it to GIF format, so I think its quality is too low to get good results.
Standard Lena image
@mosvlad I am sorry, but such a wrong output is expected from our model due to (1) similar foreground and background colors; (2) our limited training data.
Thanks for replying @ZHKKKe. It looks like the project is targeted at video matting. Is there anything we can do to optimize image (portrait) matting? (I volunteer to work on this.)
@zzmao The main problem of our current model is its relatively poor performance in portrait semantic estimation. I think one possible solution is to improve the performance of the backbone model, i.e., the MobileNetV2, in MODNet.
@ZHKKKe What should be the approach to improve the performance of the backbone model, i.e., the MobileNetV2? Also, can you tell me what type of data you used for training? And can you please tell me the approach to train on our own dataset?
@alan-ai-learner Q1: What should be the approach to improve the performance of the backbone model, i.e., the MobileNetV2? You can replace the MobileNetV2 with a more powerful model, e.g., DeepLabV3+. Besides, you may need more labeled training data. You may be interested in the large labeled dataset that will be released soon by BackgroundMattingV2.
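This is not the repo's actual interface, just a rough sketch of what a heavier multi-scale encoder wrapper could look like (using a plain ResNet-50 from torchvision instead of a full DeepLabV3+ for brevity; the real backbone wrappers and the feature strides the decoder expects should be checked in the MODNet source):

```python
import torch.nn as nn
from torchvision.models import resnet50

class ResNet50Backbone(nn.Module):
    """Hypothetical drop-in encoder: returns features at strides 2/4/8/16/32,
    which is roughly what a MobileNetV2-style backbone wrapper exposes."""
    def __init__(self, pretrained=True):
        super().__init__()
        net = resnet50(pretrained=pretrained)
        self.stage1 = nn.Sequential(net.conv1, net.bn1, net.relu)  # stride 2
        self.stage2 = nn.Sequential(net.maxpool, net.layer1)       # stride 4
        self.stage3 = net.layer2                                    # stride 8
        self.stage4 = net.layer3                                    # stride 16
        self.stage5 = net.layer4                                    # stride 32

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        f4 = self.stage4(f3)
        f5 = self.stage5(f4)
        return [f1, f2, f3, f4, f5]
```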
Q2: can you tell me what type of data you used for training? Each of our supervised training samples is a pair of (RGB image, labeled matte). The unlabeled samples used in our SOC adaptation are the RGB images.
Q3: Can you please tell me the approach to train on our own data set? Our training code will be released next month. The code will contain a template for implementing the new dataloader. It allows you to train on your own datasets. We will also provide a guideline on how to do this.
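Until the official template is out, a minimal sketch of an (RGB image, alpha matte) pair loader might look like this (the file layout, image size, and normalization are assumptions, not the released code):

```python
import os
import cv2
import torch
from torch.utils.data import Dataset

class MatteDataset(Dataset):
    """Hypothetical (RGB image, alpha matte) pair loader; the official
    dataloader template may differ once the training code is released."""
    def __init__(self, image_dir, matte_dir, size=512):
        self.names = sorted(os.listdir(image_dir))
        self.image_dir, self.matte_dir, self.size = image_dir, matte_dir, size

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        image = cv2.cvtColor(cv2.imread(os.path.join(self.image_dir, name)),
                             cv2.COLOR_BGR2RGB)
        matte = cv2.imread(os.path.join(self.matte_dir, name), cv2.IMREAD_GRAYSCALE)
        image = cv2.resize(image, (self.size, self.size))
        matte = cv2.resize(matte, (self.size, self.size))
        # scale the image to [-1, 1] (check against the official preprocessing)
        image = torch.from_numpy(image).permute(2, 0, 1).float() / 255.0 * 2 - 1
        matte = torch.from_numpy(matte).unsqueeze(0).float() / 255.0  # [0, 1]
        return image, matte
```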
Thanks, but BackgroundMattingV2 uses a different approach: we need to pass two images, one with the subject and one without the subject (background only). In MODNet we only pass one image, so how can we utilize their dataset?
@alan-ai-learner Yes. I think their dataset only consists of the labeled foregrounds. They use the images from other datasets, like COCO, to composite the training samples. Therefore, their dataset can be used to train MODNet (You only need to input the composited images for training, i.e., you do not need to input the separate background images).
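For reference, the compositing step is usually just `I = alpha * F + (1 - alpha) * B`. A minimal sketch, assuming the dataset provides foregrounds with their alpha mattes and you supply the background images (e.g., from COCO):

```python
import cv2
import numpy as np

def composite(fg, alpha, bg):
    """Composite a labeled foreground onto an arbitrary background image:
    I = alpha * F + (1 - alpha) * B. The alpha matte stays as the training label."""
    bg = cv2.resize(bg, (fg.shape[1], fg.shape[0]))
    alpha = alpha.astype(np.float32)[..., None] / 255.0  # HxWx1 in [0, 1]
    image = alpha * fg.astype(np.float32) + (1.0 - alpha) * bg.astype(np.float32)
    return image.astype(np.uint8)
```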
@ZHKKKe Do you think there will be an improvement in accuracy by combining supervised training with the estimation of foreground color?
@ZHKKKe got it
@newjavaer I think that might help, but I'm not sure about it. The lack of labeled data is a more crucial problem. The current version of MODNet mostly fails at portrait semantic estimation rather than at detail prediction.
@ZHKKKe Why do you think there are more errors in semantic estimation, apart from the lack of data?
Many errors are caused by recognizing clothes as part of the person. Maybe this can be improved by adding a penalty during training. I have collected some person matting data from several semantic datasets and the new CelebAMask-HQ dataset; it may improve the results.
@QuantumLiu By adding some background images as negative samples to the training?
@newjavaer Semantic estimation is a high-level vision task. It is much more difficult than detail prediction (low-level vision).
@QuantumLiu Yes. If we use the data from semantic datasets to train the Low-Resolution Branch of MODNet, we should get more stable results.
The solution proposed by @newjavaer is also great. We did not consider negative samples during training since it is an engineering problem.
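As a sketch of both ideas (not the actual training code): segmentation masks can be turned into coarse targets for the Low-Resolution Branch, and background-only images can be paired with an all-zero matte as negative samples. The scale factor and blurring below are assumptions.

```python
import cv2
import numpy as np
import torch

def coarse_semantic_target(mask, scale=16, blur_ksize=5):
    """Turn a binary person mask (from a segmentation dataset) into a coarse
    target for the Low-Resolution Branch: downsample, then blur to soften edges.
    Hypothetical helper -- the released training code may do this differently."""
    h, w = mask.shape[:2]
    small = cv2.resize(mask.astype(np.float32), (w // scale, h // scale),
                       interpolation=cv2.INTER_AREA)
    small = cv2.GaussianBlur(small, (blur_ksize, blur_ksize), 0)
    return torch.from_numpy(small).unsqueeze(0)  # 1 x H/scale x W/scale in [0, 1]

def negative_sample(background_image):
    """Background-only image paired with an all-zero matte, as suggested above."""
    zero_matte = np.zeros(background_image.shape[:2], dtype=np.float32)
    return background_image, zero_matte
```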
Please feel free to reopen this issue if you have further questions.
Has anyone tried replacing the Low-Resolution Branch backbone? How are the results? @QuantumLiu @mosvlad @zzmao cc @ZHKKKe
@syfbme The performance may be further improved. Please refer to https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.3/contrib/Matting
Hi,
Thanks for this great project. I tried your Colab demo for image matting, and it looks like the boundary is not clear enough for some inputs (including the one in the GitHub README).
Is there any way to improve the image matting accuracy?