tfzhou / ContrastiveSeg

ICCV2021 (Oral) - Exploring Cross-Image Pixel Contrast for Semantic Segmentation
MIT License

T-SNE of features visualization #42

Open HuadongTang opened 2 years ago

HuadongTang commented 2 years ago

Hi, thanks for your nice work. I have some questions about the t-SNE visualization.

  1. What does each point in the t-SNE visualization in your paper represent? (Is each point a pixel feature?) As you mentioned in an earlier issue, the features after the projection layer are used (tensor size [8, 256, 256, 512]). I tried to draw the t-SNE map by reshaping the features so that the first dimension is 8 × 256 × 512 = 1048576, which gave a tensor A of shape (1048576, 256). I then randomly sampled 5000 rows from the first dimension of A, but I got a bad t-SNE map. So I would like to know how you handle the features after the projection layer and how many images you use for the t-SNE visualization.
  2. About the t-SNE map: are the features for the Pixel-wise Cross-Entropy Loss map taken from the segmentation head? (I think that is the baseline method without contrastive loss, right?) But the features for the Pixel-wise Contrastive Learning Objective are from the projection layer (the method with contrastive loss?). I am confused about where the features for the two losses come from.
HuadongTang commented 2 years ago

I reshaped the features to 8 × 256 × 512 = 1048576 rows and got a tensor of shape (1048576, 256).
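
For readers following along, here is a minimal sketch of the reshape-and-sample step described above. It is not the authors' code. It assumes the projection output is laid out as [B, D, H, W] (batch, channels, height, width), and the helper names `flatten_pixel_features` and `sample_rows` are hypothetical. With that layout, calling a bare reshape to (-1, D) without first moving the channel axis to the end mixes values from different pixels into each row, which could be one cause of a poor t-SNE map; the thread does not show the original code, so this is only a guess. A small random array stands in for the real [8, 256, 256, 512] features.

```python
import numpy as np


def flatten_pixel_features(feats):
    """Turn a [B, D, H, W] array into [B*H*W, D], one row per pixel."""
    b, d, h, w = feats.shape
    # Move the channel axis last BEFORE flattening. A bare
    # feats.reshape(-1, d) on a [B, D, H, W] array would not give
    # one pixel's feature vector per row.
    return feats.transpose(0, 2, 3, 1).reshape(-1, d)


def sample_rows(x, n, seed=0):
    """Randomly pick up to n rows; also return the chosen indices."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(x.shape[0], size=min(n, x.shape[0]), replace=False)
    return x[idx], idx


# Small stand-in for the real features (assumption: random data,
# B=2, D=32, H=8, W=16 instead of [8, 256, 256, 512]).
feats = np.random.default_rng(0).standard_normal((2, 32, 8, 16)).astype(np.float32)
flat = flatten_pixel_features(feats)
subset, idx = sample_rows(flat, 100)
print(flat.shape, subset.shape)
```

The returned `idx` can be reused to pick the matching per-pixel labels, so that the sampled points can be colored by class in the plot.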

xiewende commented 2 years ago

Could you provide the code for the t-SNE feature visualization? Thank you very much!

lennart-maack commented 1 year ago

Hey everyone,

I implemented the tsne visualization in the following way:

  1. Getting the features for each pixel
     a. In my case I have a feature embedding of shape [B, D, H, W], with B being the batch size, D the number of features for each pixel, and H, W the height and width of the feature embedding. In my case it is [32, 256, 33, 33].
     b. So I have [32, 256, 1089], i.e. 256 features for each of the 1089 pixels.
     c. I apply PCA to keep only 100 features for each pixel.
     d. Afterwards I apply t-SNE.

I implemented it for an example with a binary segmentation mask as ground truth (getting the label for each pixel): Link to code
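
The steps above can be sketched roughly as follows. This is not the linked code, only an illustration that assumes scikit-learn's `PCA` and `TSNE` and a [B, D, H, W] NumPy array; the function name `tsne_pixel_embeddings` and its parameters are hypothetical, and a tiny random array replaces the real [32, 256, 33, 33] embedding so the example runs quickly.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def tsne_pixel_embeddings(feats, pca_dim=100, perplexity=30.0, seed=0):
    """[B, D, H, W] features -> [B*H*W, 2] t-SNE coordinates."""
    b, d, h, w = feats.shape
    # One row per pixel, D features per row (channel axis moved last).
    x = feats.transpose(0, 2, 3, 1).reshape(-1, d)
    # PCA cannot keep more components than features or samples.
    n_comp = min(pca_dim, d, x.shape[0])
    x = PCA(n_components=n_comp, random_state=seed).fit_transform(x)
    # perplexity must be smaller than the number of rows in x.
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca",
                random_state=seed)
    return tsne.fit_transform(x)


# Tiny stand-in (assumption: random data, B=2, D=16, H=W=5).
feats = np.random.default_rng(0).standard_normal((2, 16, 5, 5)).astype(np.float32)
coords = tsne_pixel_embeddings(feats, pca_dim=8, perplexity=10.0)
print(coords.shape)
```

To color the points by class, the ground-truth mask would need to be resized to H × W and flattened in the same pixel order as the features. For a full batch such as [32, 256, 33, 33], subsampling rows before t-SNE keeps the runtime manageable.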

using369 commented 1 year ago

I reshape the features to 8x256x512=1048576 and got tensor (1048576, 256)

Hello, I got the same bad result. Have you solved it? Can we discuss it?

Lemonweier commented 1 year ago

@lennart-maack Hi, thank you for your code. I want to ask about D: how do I know the number of features for each pixel? Thank you!

using369 commented 1 year ago

@lennart-maack Hi, thank you for your code! I want to ask about the domains. What does "either src, src_to_trgt, trgt" stand for?

using369 commented 1 year ago

Maybe we can get in touch; my email is 1205060715@qq.com