Closed: yuxwind closed this issue 5 years ago
We treat the Part Segmentation/DensePose predictions as RGB images, so once you have these predictions, it should be straightforward to simply change the input of the network from RGB to Part Segmentation or DensePose and train it accordingly. Extra care is required only for data augmentation (e.g., during flipping, where this kind of input must be handled differently from RGB).
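For concreteness, here is a minimal sketch of that flipping caveat, assuming the part segmentation is stored as an integer label map. The `FLIP_PAIRS` label IDs below are hypothetical placeholders, since the actual IDs depend on the part-segmentation scheme you use (a DensePose IUV map would additionally require mirroring the U coordinate):

```python
import numpy as np

# Hypothetical left/right body-part label pairs; the real IDs depend on
# your part-segmentation scheme (these are NOT taken from this repo).
FLIP_PAIRS = [(1, 2), (3, 4), (5, 6)]  # e.g., (L_arm, R_arm), (L_leg, R_leg), ...

def flip_rgb(img: np.ndarray) -> np.ndarray:
    """RGB input: a horizontal flip is just a mirror of the pixel grid."""
    return img[:, ::-1].copy()

def flip_part_segmentation(seg: np.ndarray) -> np.ndarray:
    """Part-segmentation input: mirror the grid AND swap left/right labels,
    otherwise the flipped image claims the left arm is on the right side."""
    seg = seg[:, ::-1].copy()
    out = seg.copy()
    for left, right in FLIP_PAIRS:
        out[seg == left] = right
        out[seg == right] = left
    return out
```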
Good job! I have a question about the inputs.
You present experiments with a variety of inputs, including RGB images, segmentation, and DensePose. I see that your code handles RGB images with ResNet, but there seems to be no code for the others. Do you have any plans to release the code for the other two representations?
I understand that the paper does not focus on the effect of different input representations, which is likely why the code for the other two was not released. It would also be great if you could help me with the following questions.
Do you treat the segmentation/DensePose predictions as RGB images and feed them into ResNet? Or do you use some other encoder, such as the DensePose network, to extract features for the GCN?
Thanks!