juhongm999 / hsnet

Official PyTorch Implementation of Hypercorrelation Squeeze for Few-Shot Segmentation, ICCV 2021

5-shot training #28

Closed Joseph-Lee-V closed 2 years ago

Joseph-Lee-V commented 2 years ago

Hello, thanks for your great work on FSS! I am confused about some implementation details of 5-shot training.

  1. According to #8, does this mean that the dataloader gives 5 support-query pairs, the model produces 5 losses, and these losses are summed for the backward pass?
  2. Given Figure A12, would it be more reasonable to use five support images and one query image for 5-shot training?
  3. According to Section 4.5 and #18, what is the meaning of "maximum voting score"? Thank you.
juhongm999 commented 2 years ago
  1. What I meant by n-shot training was to pass a batch of 1 query & n support images and (perhaps) average the n mask results to get a single mask for the loss.
  2. My apologies. I don't understand what you mean by "to get five support images ... for 5-shot training".
  3. For example, given 5 support images, our model provides 5 mask predictions where each pixel is in {0, 1}, and these predictions are element-wise summed (you can think of this as a voting process for each pixel position) to provide a single mask with values in {0, ..., 5}. Dividing by the maximum voting score means all the voting scores are normalized so that they are in [0, 1].
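
The voting step described in point 3 can be sketched as follows. This is a minimal illustration in plain Python, not the repository's code: the function name `vote_masks` and the toy 2x2 masks are hypothetical. It also makes two assumptions the thread does not confirm: that the "maximum voting score" is the largest vote count actually observed in the summed map (clamped to at least 1 to avoid division by zero), and that a threshold of 0.5 turns the normalized scores back into a binary mask.

```python
def vote_masks(masks, threshold=0.5):
    """Aggregate binary masks (one per support image) by per-pixel voting.

    `masks` is a list of equally sized 2D lists with values in {0, 1}.
    Returns (votes, scores, final_mask).
    """
    height, width = len(masks[0]), len(masks[0][0])

    # Element-wise sum: each pixel counts how many predictions marked it foreground.
    votes = [[sum(m[r][c] for m in masks) for c in range(width)]
             for r in range(height)]

    # ASSUMPTION: the maximum voting score is the largest observed vote count,
    # clamped to 1 so an all-background map does not divide by zero.
    max_vote = max(1, max(max(row) for row in votes))

    # Normalize the votes so that every score lies in [0, 1].
    scores = [[v / max_vote for v in row] for row in votes]

    # ASSUMPTION: pixels with a score >= threshold are labeled foreground.
    final_mask = [[1 if s >= threshold else 0 for s in row] for row in scores]
    return votes, scores, final_mask


# Five toy predictions for one query image, one per support image.
predictions = [
    [[1, 0], [1, 0]],
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
    [[1, 0], [1, 0]],
    [[1, 0], [0, 0]],
]
votes, scores, final_mask = vote_masks(predictions)
print(votes)       # per-pixel vote counts in {0, ..., 5}
print(scores)      # normalized scores in [0, 1]
print(final_mask)  # binary mask after thresholding
```

In practice the same operations would be applied to batched tensors rather than nested lists; the lists here only keep the sketch dependency-free.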
Joseph-Lee-V commented 2 years ago

Question 2 is similar to question 1. I'm now clear about these issues. Thanks for your response!