Open laurentmih opened 5 years ago
Thanks for pointing this out! Although erosion is not mentioned in the paper "Deep Image Matting", I also found that the ground truth labels are generally unbalanced. I ran a new experiment and got better performance, closer to the numbers reported in the paper.
Hi, I just quickly wanted to say thanks: I used your structure to re-implement the model myself using `resnet34` instead of `VGG16`, but turned to your code for inspiration or when I got stuck.
Anyway, I wanted to mention: you write here that you're not adding a Sigmoid at the end because your results converge to zero.
I ran into the same issue. The model learns within a single batch to predict only `0`s. For me, the issue was that I had not added erosion to the matting stage. The result is that the "ground truth" for the model mostly contains `0`s (>90%), causing it to predict `0`s everywhere. I fixed it by also adding erosion, the same amount as dilation. This "balances" the labels to contain both `1`s and `0`s, punishing a model that only predicts `0`s (the explanation here is a bit rubbish, happy to elaborate more if you want me to).
I saw here that you're only dilating, not eroding. Just wanted to suggest you check it out; maybe you're running into the same problem I had. Adding erosion might enable the usage of a sigmoid.
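To make the label-balance point a bit more concrete, here is a rough, dependency-free sketch of what I mean. It is not the code from this repo or from mine: the function names (`dilate`, `erode`, `unknown_band`, `zero_fraction`), the square structuring element, the radius of 2 and the toy 20x20 alpha matte are all assumptions made up for illustration. In a real pipeline you would more likely use something like `cv2.dilate` and `cv2.erode`.

```python
def dilate(mask, r):
    """Binary dilation with a (2r+1) x (2r+1) square structuring element."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for yy in range(max(0, y - r), min(h, y + r + 1)):
                    for xx in range(max(0, x - r), min(w, x + r + 1)):
                        out[yy][xx] = 1
    return out


def erode(mask, r):
    """Binary erosion, computed as the complement of the dilated complement."""
    inverted = [[1 - v for v in row] for row in mask]
    return [[1 - v for v in row] for row in dilate(inverted, r)]


def unknown_band(alpha, r, use_erosion):
    """Mark the 'unknown' trimap region in which the model is supervised.

    Assumption: the band is the dilated object minus the certain foreground,
    where the certain foreground is optionally eroded by the same radius.
    """
    certain_fg = [[1 if a == 1.0 else 0 for a in row] for row in alpha]
    any_fg = [[1 if a > 0.0 else 0 for a in row] for row in alpha]
    outer = dilate(any_fg, r)
    inner = erode(certain_fg, r) if use_erosion else certain_fg
    return [
        [1 if o and not i else 0 for o, i in zip(outer_row, inner_row)]
        for outer_row, inner_row in zip(outer, inner)
    ]


def zero_fraction(alpha, band):
    """Fraction of pixels inside the band whose ground-truth alpha is 0."""
    values = [
        a
        for alpha_row, band_row in zip(alpha, band)
        for a, b in zip(alpha_row, band_row)
        if b
    ]
    return sum(1 for a in values if a == 0.0) / len(values)


# Toy alpha matte: a solid 10x10 foreground square on a 20x20 background.
alpha = [
    [1.0 if 5 <= y < 15 and 5 <= x < 15 else 0.0 for x in range(20)]
    for y in range(20)
]

dilate_only = zero_fraction(alpha, unknown_band(alpha, 2, use_erosion=False))
with_erosion = zero_fraction(alpha, unknown_band(alpha, 2, use_erosion=True))
print(dilate_only)   # 1.0: the band only covers background pixels
print(with_erosion)  # 0.6: the band now also reaches into the foreground
```

On this toy matte, dilation alone gives a band in which every target is `0`, so predicting `0` everywhere is already a perfect answer. Eroding by the same radius pulls the band into the foreground, so a constant-`0` prediction gets penalised there. Real mattes have soft edges, so the exact ratios will differ.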
Thanks again for the inspiration!