First of all, thank you for open-sourcing this amazing code.
I have a question about the mask embedding. From reading the code, I see that it needs both the difference of the Gaussian keypoint heatmaps and the deformed source image, which are then concatenated.
What was the intent behind this movement encoding? Is there a reference for it?
After it passes through the Same block and the Hourglass prediction, how does the movement encoding act as a mask?
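To make sure I am reading it correctly, here is a minimal sketch of how I understand the inputs to be built. This is not the repository's code: the function names, the `var` value, the [-1, 1] coordinate grid, and the sign convention (driving minus source) are all my own assumptions.

```python
import math

def gaussian_heatmap(kp, size, var=0.01):
    """Gaussian bump centred on kp = (x, y), with coordinates in [-1, 1]."""
    coords = [2 * i / (size - 1) - 1 for i in range(size)]
    return [[math.exp(-0.5 * ((x - kp[0]) ** 2 + (y - kp[1]) ** 2) / var)
             for x in coords]
            for y in coords]

def heatmap_difference(kp_source, kp_driving, size, var=0.01):
    """One channel per keypoint: driving heatmap minus source heatmap (assumed sign)."""
    channels = []
    for ks, kd in zip(kp_source, kp_driving):
        hs = gaussian_heatmap(ks, size, var)
        hd = gaussian_heatmap(kd, size, var)
        channels.append([[d - s for d, s in zip(row_d, row_s)]
                         for row_d, row_s in zip(hd, hs)])
    return channels

def movement_encoding(kp_source, kp_driving, deformed_source, size, var=0.01):
    """Concatenate heatmap-difference channels with the deformed image channels."""
    return heatmap_difference(kp_source, kp_driving, size, var) + deformed_source
```

With one keypoint and a 3-channel deformed image, this gives 4 channels, and the heatmap channel is zero everywhere when the keypoint does not move. Is that the right picture?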
Thank you!