Zhangjinso / PISE


quality about the par_sav #33

Closed pilgrim00 closed 3 years ago

pilgrim00 commented 3 years ago

I am curious about the quality of the output parsing map from p2. I trained for several days on a V100. The final PNG looks very good, but the par_sav is different from SPL2. image

image

image

image

I have also seen the constraints on them, which should help make them very similar. So how does this happen, with no head and no skin in the image? I would also like to ask whether CoordConv is useful.

Zhangjinso commented 3 years ago

Hi. For the first question, my guess is that it comes from using the logits as the 'parsing result' fed to the image generator for human pose transfer, which may allow a lower loss while covering only a limited set of regions (upper clothes and lower clothes seem to be the largest regions in the human parsing). For the second question, we use CoordConv in the spatial-aware normalization. In our opinion it is useful for learning spatial correspondence, acting like a position encoding. However, we did not conduct an ablation study on this part.
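For readers unfamiliar with the CoordConv idea mentioned above, here is a minimal NumPy sketch of the general technique: two extra channels holding normalized y and x coordinates are appended to a feature map before it goes into a convolution, so the layer can see where each pixel is. This is not taken from the PISE code; the function name `add_coord_channels`, the [-1, 1] range, and the (N, C, H, W) layout are assumptions made for illustration.

```python
import numpy as np


def add_coord_channels(features):
    """Append normalized y and x coordinate channels to an (N, C, H, W) float array.

    Hypothetical helper illustrating the CoordConv idea; not the PISE implementation.
    """
    n, _, h, w = features.shape
    # Coordinates are scaled to [-1, 1] (an assumed convention).
    ys = np.linspace(-1.0, 1.0, h, dtype=features.dtype)
    xs = np.linspace(-1.0, 1.0, w, dtype=features.dtype)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")  # each has shape (H, W)
    coords = np.stack([yy, xx], axis=0)          # shape (2, H, W)
    coords = np.broadcast_to(coords, (n, 2, h, w))
    # The result has C + 2 channels: the original features, then y, then x.
    return np.concatenate([features, coords], axis=1)


feats = np.zeros((2, 3, 4, 5), dtype=np.float32)
out = add_coord_channels(feats)
print(out.shape)
```

A convolution applied to `out` would take `C + 2` input channels instead of `C`. Whether this actually helps in PISE is untested, as noted above, since no ablation was run.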