5-shot training

Joseph-Lee-V commented 2 years ago

Hello, thanks for your great work on FSS! I am confused about some implementation details of 5-shot training.

According to #8, does this mean that the dataloader gives 5 support-query pairs, the model produces 5 losses, and these losses are summed before the backward pass?
Given Figure A12, would it be more reasonable to use five support images and one query image for 5-shot training?
According to Section 4.5 and #18, what is the meaning of "maximum voting score"? Thank you.

juhongm999 commented 2 years ago

What I meant by n-shot training was to pass a batch of 1 query and n support images and (perhaps) average the n mask results to get a single mask for the loss.
My apologies, I don't understand what you mean by "to get five support images ... for 5-shot training".
For example, given 5 support images, our model provides 5 mask predictions in which each pixel is in {0, 1}. These predictions are summed element-wise (you can think of this as a voting process at each pixel position) to give a single mask with values in {0, ..., 5}. Dividing by the maximum voting score normalizes all the voting scores so that they lie in [0, 1].
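The voting step described above can be sketched in a few lines of plain Python. This is only an illustration, not the repository's code: the function name `vote_masks` and the toy 2x2 masks are hypothetical, and the sketch assumes that "maximum voting score" means the largest vote count actually observed in the summed mask (with a guard for the all-background case).

```python
def vote_masks(masks):
    """Sum binary masks element-wise, then divide by the maximum vote.

    `masks` is a list of n equally sized 2D lists with entries in {0, 1},
    one per support image. Hypothetical helper for illustration only.
    """
    height, width = len(masks[0]), len(masks[0][0])
    # Each pixel collects one vote per support image that predicts foreground.
    votes = [
        [sum(mask[r][c] for mask in masks) for c in range(width)]
        for r in range(height)
    ]
    # Assumption: the "maximum voting score" is the largest observed vote.
    max_vote = max(max(row) for row in votes)
    if max_vote == 0:
        # No foreground vote anywhere; avoid dividing by zero.
        return [[0.0] * width for _ in range(height)]
    return [[v / max_vote for v in row] for row in votes]


# Five toy 2x2 predictions, one per support image.
masks = [
    [[1, 0], [1, 1]],
    [[1, 0], [0, 1]],
    [[1, 0], [0, 1]],
    [[1, 1], [0, 1]],
    [[1, 0], [0, 0]],
]
normalized = vote_masks(masks)
```

Here the summed votes are 5, 1, 1, and 4, so after normalization every value lies in [0, 1], and the pixel that all five predictions agree on gets a score of 1.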

Joseph-Lee-V commented 2 years ago

Question 2 is similar to question 1, and I'm now clear about these issues. Thanks for your response!

juhongm999 / hsnet