Open bluesky314 opened 4 years ago
Instead of only feeding the network the binary trimap we also feed the distance transformed version. The distance from the definite foreground and background regions is a strong indicator of what the alpha could be.
The clicks variable represents the transformed trimap. The variable name is not the most accurate and will eventually be fixed.
Ok, but it does not seem like a simple distance transform. What do `3*k`, `3*k+1`, `3*k+2` and `2 * ((0.02 * L)**2)`, `2 * ((0.08 * L)**2)` ... mean in the for loop? What is the whole loop doing? And why is `clicks` of dimension 6?
The distance transform is used to compute an approximate alpha matte based on the trimap.
The first function used here falls to approximately 0 at a distance of about 25 pixels, the second at about 100 pixels, and the third at about 200 pixels.
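A minimal sketch of the three falloff functions, to make the numbers above concrete. It assumes each function is a Gaussian of the distance `d`, i.e. `exp(-d**2 / (2 * (s * L)**2))`, which matches the denominators quoted from the loop. The value `L = 320` and the third scale `0.16` do not appear in this thread; they are assumptions chosen so that the curves vanish near 25, 100 and 200 pixels as described. The name `falloff` is hypothetical.

```python
import math

L = 320  # ASSUMPTION: reference length in pixels, not stated in this thread
SCALES = (0.02, 0.08, 0.16)  # 0.02 and 0.08 are quoted from the loop; 0.16 is ASSUMED


def falloff(distance, scale, length=L):
    """Gaussian falloff exp(-d^2 / (2 * (scale * length)^2)) of a pixel distance."""
    sigma = scale * length
    return math.exp(-distance ** 2 / (2 * sigma ** 2))


# Evaluate each function at distance 0 and at the distance where it should vanish.
for scale, cutoff in zip(SCALES, (25, 100, 200)):
    print(f"scale={scale}: at 0 px -> {falloff(0, scale):.3f}, "
          f"at {cutoff} px -> {falloff(cutoff, scale):.5f}")
```

Under these assumptions each function equals 1 on the definite region itself (distance 0) and is below 0.001 at its respective cutoff distance.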
Here is a plot of the distance vs. the approximate alpha matte value for the first function:
`clicks` has 6 channels because the three distances are computed to both the fixed foreground and background of the trimap.
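To illustrate where the 6 channels come from, here is a rough pure-Python sketch rather than the repository's actual code. It uses a brute-force Euclidean distance in place of an optimized distance transform, and it assumes Gaussian falloffs with `L = 320` and scales `0.02`, `0.08`, `0.16` (only the first two scales are quoted in this thread). The function names, the background-then-foreground channel order, and the choice to leave channels at zero when a region is empty are all assumptions made for the example.

```python
import math


def distance_to_mask(mask):
    """Brute-force Euclidean distance from every pixel to the nearest 1 in mask."""
    h, w = len(mask), len(mask[0])
    seeds = [(y, x) for y in range(h) for x in range(w) if mask[y][x]]
    return [[min(math.hypot(y - sy, x - sx) for sy, sx in seeds)
             for x in range(w)] for y in range(h)]


def trimap_to_channels(background, foreground, length=320,
                       scales=(0.02, 0.08, 0.16)):
    """Return 6 maps: channels 3*k, 3*k+1, 3*k+2 hold the three falloffs for
    region k (k=0: definite background, k=1: definite foreground; ASSUMED order)."""
    h, w = len(background), len(background[0])
    channels = [[[0.0] * w for _ in range(h)] for _ in range(6)]
    for k, mask in enumerate((background, foreground)):
        if not any(any(row) for row in mask):
            continue  # ASSUMPTION: an empty region leaves its 3 channels at zero
        dist = distance_to_mask(mask)
        for j, scale in enumerate(scales):
            sigma = scale * length
            for y in range(h):
                for x in range(w):
                    channels[3 * k + j][y][x] = math.exp(
                        -dist[y][x] ** 2 / (2 * sigma ** 2))
    return channels


# Tiny 1x5 trimap: definite background on the left, definite foreground on the right.
background = [[1, 0, 0, 0, 0]]
foreground = [[0, 0, 0, 0, 1]]
clicks = trimap_to_channels(background, foreground)
print(len(clicks))  # number of channels
```

Each pair of region and scale yields one map, so 2 regions times 3 scales gives the 6 channels.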
Thanks @99991 and @MarcoForte, but I don't see how the distance transform makes sense when the images are all at different scales. Distance in pixel space, which is what the distance transform uses, may not mean much when the image is a close-up of a person's face, because all points are close to the object of interest.
@99991 , @MarcoForte Can either of you clarify if I am getting something wrong about the appropriateness of distance maps?
@99991, @MarcoForte Did you get a chance to think about the above point, that the distance transform does not take the scale of the image into account?
Hey, I do not understand how the distance map is being used here and what exactly the clicks variable is supposed to represent:
Can you please explain this?