tsing90 / pytorch_semantic_human_matting

This is an unofficial implementation of the paper "Semantic human matting":
https://arxiv.org/pdf/1809.01354.pdf

What does alpha_r look like? #7

Closed gdjmck closed 5 years ago

gdjmck commented 5 years ago

According to the network structure in the original paper, the output of M-Net, alpha_r, looks roughly like the final alpha, but the alpha loss seems to relate only to the unknown regions. Did your M-Net output resemble the paper's, or was it more similar to the unsure layer of the trimap? My training result looks more like the unsure layer, with most of the foreground and background regions dark.

tsing90 commented 5 years ago

You probably deleted this line of code: `alpha_p = fg + unsure * alpha_r` (the last line of the net_M class), which is what makes alpha_p similar to the ground-truth alpha.
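For reference, the fusion step quoted above can be sketched with toy arrays (NumPy here for illustration; the repo operates on PyTorch tensors, and all values below are made up):

```python
import numpy as np

# T-Net outputs per-pixel probabilities for confident foreground (fg) and the
# unknown band (unsure); M-Net outputs a raw matte prediction alpha_r.
fg      = np.array([1.0, 0.0, 0.0, 1.0])  # confident foreground mask
unsure  = np.array([0.0, 1.0, 1.0, 0.0])  # unknown / transition band
alpha_r = np.array([0.9, 0.6, 0.2, 0.1])  # M-Net's raw matte prediction

# The fusion from the thread: alpha_p = fg + unsure * alpha_r.
# Confident fg pixels are forced to 1, confident bg to 0, and alpha_r
# only contributes inside the unknown band.
alpha_p = fg + unsure * alpha_r
print(alpha_p)  # [1.  0.6 0.2 1. ]
```

This is why deleting that line makes the output look like the unsure layer: without the fusion, you are looking at alpha_r alone, which the loss only constrains inside the band.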

gdjmck commented 5 years ago

I know the predicted alpha is generated by the equation above; I am just curious what alpha_r looks like on its own. There is a figure in the original paper showing that alpha_r looks more like the ground-truth alpha, rather than just having values within the unsure region of the trimap with the rest remaining 0.

judgeeeeee commented 5 years ago

[image] @gdjmck that's my alpha_r

gdjmck commented 5 years ago

@judgeeeeee Thanks for your reply! So the two almost identical images in the first row are the ground-truth trimap and the predicted trimap, and the bottom-right image is alpha_r? Does that mean alpha_r does not really need to contain semantic information, or is it just that the images you offered do not contain much semantic information in the foreground? Since the alpha loss is masked by the unsure mask, regions outside the unsure mask contribute no loss, so the model shouldn't pay any attention to them. That's why I am confused about what alpha_r should look like.
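The masking described above can be made concrete with a toy example (a minimal NumPy sketch of an unsure-masked L1 loss; the repo's actual loss function may differ in form):

```python
import numpy as np

unsure   = np.array([0.0, 1.0, 1.0, 0.0])  # 1 inside the unknown band
alpha_r  = np.array([0.9, 0.6, 0.2, 0.1])  # M-Net prediction (toy values)
alpha_gt = np.array([1.0, 0.5, 0.3, 0.0])  # ground-truth alpha (toy values)

# L1 loss masked by the unsure band, normalized by the band's area.
# Pixels outside the band contribute zero loss and zero gradient, so
# alpha_r is effectively unconstrained on confident fg/bg pixels.
masked_l1 = np.sum(unsure * np.abs(alpha_r - alpha_gt)) / max(np.sum(unsure), 1)
print(masked_l1)  # only the two band pixels (errors 0.1 and 0.1) count
```

This is exactly why alpha_r outside the band can look like anything: nothing in the objective pushes it toward the ground truth there.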

judgeeeeee commented 5 years ago

@gdjmck [image]

The first row are both ground-truth trimaps (I use that code for T-Net and M-Net, but M-Net needs a trimap as input, so I put the ground truth in that place).

I believe you can understand better through that picture, but I am not sure my M-Net has converged well.

Maybe you can show your result.

gdjmck commented 5 years ago

@judgeeeeee Yeah, this one makes more sense to me: there are positive values in foreground regions, not just within the unsure regions. The alpha_r I trained for M-Net was almost identical to the input unsure layer, with barely noticeable small white dots outside the unsure regions. Sorry, I deleted the tensorboard files yesterday and started training M-Net with newly generated trimaps this morning; maybe I'll post my result here when I'm done pre-training M-Net. I noticed an interesting pattern in your alpha_r that showed up in my previous training as well: the areas of alpha_r surrounding the unsure regions are clean as a whistle. Maybe that's something the model learned :)

gdjmck commented 5 years ago

@judgeeeeee I think the problem I previously encountered was that my model wasn't complex enough. I deepened the layers of the encoder and decoder and it got better performance. [alpha_r] [alpha] [alpha_gt] This is a training batch of alpha_r, alpha and alpha_gt at epoch 22, and the loss is descending as expected.

judgeeeeee commented 5 years ago

@gdjmck your alpha_r looks so cooooooooool

sanshibayuan commented 5 years ago

How is your M-Net loss going? I may have encountered some problems with my M-Net and could use some help. Here is my loss, which is pretty low from the beginning and shows no sign of descending: [image]

Here are my training results; they show two different kinds of output (from the same dataset). pic_1 is more like the alpha_r you described, although I'm not sure how I got it; pic_2 is more like the real results. [image]

[image]

gdjmck commented 5 years ago

> How is your M-Net loss going? I may have encountered some problems with my M-Net and could use some help. Here is my loss, which is pretty low from the beginning and shows no sign of descending.
>
> Here are my training results; they show two different kinds of output (from the same dataset). pic_1 is more like the alpha_r you described, although I'm not sure how I got it; pic_2 is more like the real results.

Could you explain what the third column of the images you showed is? Is it the unsure map that your M-Net predicted? The first row looks like it, but the second row seems quite wrong. Maybe you should check your training data first? I got better results when I checked the unsure maps of my training data.

sanshibayuan commented 5 years ago

> ... checked the unsure map of ...

Thanks for the reply; sorry I didn't make my problem clear. Those images are my M-Net predictions (alpha_p), and the middle column is my ground truth. During the same training stage, my M-Net predicted two different kinds of alpha_p, which are in the third row. I think the second one looks more like the real alpha_p.
And these are my alpha_r, which seem quite strange and different from yours; they also come in two kinds... [image] [image]

Any ideas? I will look into my trimap and check again.

gdjmck commented 5 years ago

> Thanks for the reply; sorry I didn't make my problem clear. Those images are my M-Net predictions (alpha_p), and the middle column is my ground truth. During the same training stage, my M-Net predicted two different kinds of alpha_p, which are in the third row. I think the second one looks more like the real alpha_p. And these are my alpha_r, which seem quite strange and different from yours; they also come in two kinds...
>
> Any ideas? I will look into my trimap and check again.

So I think it is your T-Net that is not working, and that is causing the whole problem. The T-Net gives the most confident foreground region; the unsure region is then optimized by the M-Net, which means M-Net focuses only on the unsure region. In your case, some outputs of your model miss the whole foreground region, which should be covered by the T-Net. I assume you didn't pretrain the T-Net first? Pretraining the T-Net to good performance first may help you a lot.
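For what it's worth, the T-Net pretraining suggested above is usually a per-pixel 3-class classification (bg / unsure / fg) against the ground-truth trimap. A minimal NumPy sketch of that cross-entropy objective (function name and shapes are my own; the repo's actual loss code may differ):

```python
import numpy as np

def trimap_ce(logits, labels):
    """logits: (N, 3) class scores per pixel; labels: (N,) in {0: bg, 1: unsure, 2: fg}."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # subtract max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Average negative log-likelihood of the correct trimap class per pixel.
    return -log_probs[np.arange(len(labels)), labels].mean()

# A confident, correct prediction gives a near-zero loss:
logits = np.array([[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]])
labels = np.array([0, 1])
print(trimap_ce(logits, labels))  # close to 0
```

Once this pretraining converges, the predicted unsure band is reliable enough for M-Net (and the end-to-end fusion) to focus on the right pixels.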