Thank you for sharing the code for MoCo. I have two questions about using MoCo in a segmentation task.
In a segmentation task, the network input is H*W*3, and the last two feature maps go from H*W*C to H*W*1. So the final feature map of a segmentation task is two-dimensional (spatial), rather than the one-dimensional vector produced in a classification task. How can we calculate the InfoNCE loss for this two-dimensional feature map?
If the feature is one-dimensional, f_q and f_k are both 1*n, and f_q * f_k.T returns a single value. But if f_q and f_k are m*n, f_q * f_k.T returns a matrix.
My thought is that we can reshape f_q and f_k from m*n to 1*mn, so that f_q * f_k.T again returns a single value. Am I right? Please correct me if my thinking is wrong.
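For what it's worth, the reshape-then-dot-product idea can be sketched in PyTorch like this (the shapes C, h, w are made up for illustration; MoCo additionally L2-normalizes the features so the dot product is a cosine similarity):

```python
import torch
import torch.nn.functional as F

# Hypothetical dense feature map of size C x h x w per image.
C, h, w = 128, 7, 7
f_q = torch.randn(C, h, w)  # query feature map
f_k = torch.randn(C, h, w)  # key feature map

# Flatten each map to a single row vector (1 x C*h*w), then L2-normalize
# so the dot product lands in [-1, 1], as MoCo's logits do.
q = F.normalize(f_q.reshape(1, -1), dim=1)
k = F.normalize(f_k.reshape(1, -1), dim=1)

sim = q @ k.t()  # shape (1, 1): a single scalar similarity
print(sim.shape)
```

One caveat with flattening: it makes the loss sensitive to the absolute spatial position of every feature, which is exactly what causes the rotation problem you raise below.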
In a classification task, the output should be the same for any augmented version of an image. For an image X and a rotated image Y, the outputs are feature_X and feature_Y, and feature_X should equal feature_Y because they come from the same image. In that case, feature_X * feature_Y.T should return 1.
However, if image_Y = rotation(image_X), then for a spatial feature map feature_Y should be rotation(feature_X), so feature_X * feature_Y.T will not equal 1. How can we deal with this problem? Or should we inversely rotate feature_Y to obtain irotated_feature_Y and calculate feature_X * irotated_feature_Y.T instead?
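The inverse-rotation idea can be sketched as follows. This is only an illustration under the assumption of perfect equivariance (I simulate feature_Y by rotating feature_X directly, standing in for a network whose features rotate with the input); a real backbone is only approximately equivariant, so the similarity would be close to but not exactly 1:

```python
import torch
import torch.nn.functional as F

C, h, w = 64, 8, 8
feature_X = torch.randn(C, h, w)

# Assumption: image_Y = rot90(image_X) and the backbone is equivariant,
# so feature_Y = rot90(feature_X). Here we rotate directly as a stand-in
# for the network output on the rotated image.
feature_Y = torch.rot90(feature_X, k=1, dims=(1, 2))

# Undo the known augmentation on the key features before comparing.
irotated_feature_Y = torch.rot90(feature_Y, k=-1, dims=(1, 2))

q = F.normalize(feature_X.reshape(1, -1), dim=1)
k = F.normalize(irotated_feature_Y.reshape(1, -1), dim=1)
sim = (q @ k.t()).item()
print(sim)  # 1.0 here, because the rotation is undone exactly
```

Since the augmentation parameters are known at training time, inverting them on the key features (or on a sampling grid) before computing the loss is a reasonable way to align the two spatial maps.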
Hello, I would like to know how we can transfer MoCo to a downstream segmentation task. Is it a matter of first pretraining the backbone with MoCo, and then replacing the backbone in the segmentation network with the pretrained one?
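Yes, that is the usual recipe: pretrain with MoCo, keep only the query encoder's backbone weights (dropping the projection head), and load them into the segmentation network's backbone. A minimal sketch of the key-renaming step, using a tiny stand-in module instead of a real checkpoint (MoCo's released checkpoints store the query encoder under keys prefixed `module.encoder_q.`; the head layer name `fc` follows the ResNet convention):

```python
import torch
import torch.nn as nn

# Tiny stand-in for a backbone + head, mimicking the key layout of a
# MoCo checkpoint ("module.encoder_q.*").
class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3)   # backbone layer to transfer
        self.fc = nn.Linear(16, 10)        # projection head, to discard

enc = Encoder()
state = {f"module.encoder_q.{k}": v for k, v in enc.state_dict().items()}

# Strip the prefix and drop the head before loading into the new model.
prefix = "module.encoder_q."
backbone_state = {
    k[len(prefix):]: v for k, v in state.items()
    if k.startswith(prefix) and not k.startswith(prefix + "fc")
}

seg_backbone = Encoder()  # backbone reused inside the segmentation net
msg = seg_backbone.load_state_dict(backbone_state, strict=False)
print(sorted(msg.missing_keys))  # only the discarded head is missing
```

With `strict=False`, only the head weights are reported missing; the segmentation decoder is then trained on top of (or jointly fine-tuned with) the loaded backbone.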
Any suggestion is appreciated.