kaixin96 / PANet

Code for our ICCV 2019 paper PANet: Few-Shot Image Semantic Segmentation with Prototype Alignment

Question about random sampling for evaluation #8

Closed Na-Z closed 4 years ago

Na-Z commented 4 years ago

Thank you for your great work.

I have a question regarding the evaluation. As mentioned in the paper, the evaluation is computed by "average the results from 5 runs with different random seeds, each run containing 1,000 episodes."
In this case, do you keep the 5*1000 episodes fixed when you run the baseline methods (such as PL [4] and SG-One [28]) and the ablation studies? And in the 2-way setting, is the "person" class always held out, as in PL [4]?

kaixin96 commented 4 years ago

Hi @Na-Z , the results of the other methods (e.g. PL, SG-One) are taken from their papers. For our experiments, yes, the 5*1000 episodes are fixed because we use the same seeds. I don't quite understand your second question. Do you mean only using images that contain the "person" class?
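A minimal sketch of how fixed evaluation episodes can be reproduced with shared seeds (the function name `sample_episodes` and the class IDs are illustrative, not PANet's actual code):

```python
import random

def sample_episodes(class_ids, n_episodes, n_way, seed):
    """Sample a fixed list of episodes; the same seed yields the same episodes."""
    rng = random.Random(seed)  # independent RNG so other code cannot perturb it
    episodes = []
    for _ in range(n_episodes):
        # each episode is an n_way combination of held-out classes
        episodes.append(tuple(sorted(rng.sample(class_ids, n_way))))
    return episodes

# 5 runs x 1000 episodes, each run fixed by its own seed
runs = [sample_episodes(list(range(5)), 1000, 2, seed=s) for s in range(5)]

# A second evaluation with the same seeds sees identical episodes,
# so ablations are compared on exactly the same 5*1000 episodes
runs_again = [sample_episodes(list(range(5)), 1000, 2, seed=s) for s in range(5)]
assert runs == runs_again
```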

Thank you.

Na-Z commented 4 years ago

Thank you for your reply.

In the 2-way evaluation of PL, they set "person" as one held-out class and choose another class from Pascal-5^2. Do you follow the same strategy? Also, sorry, I couldn't find a reported 2-way 5-shot result in the SG-One paper. May I ask where you took it from?

kaixin96 commented 4 years ago

In my understanding, for 2-way segmentation in PL, they hold out split 2 (which contains the 'person' class) and use the other 3 splits for training. I didn't find that they set 'person' as one fixed held-out class.

For the SG-One results, we are sorry that we made a mistake. In their paper, SG-One reports 29.4 for multi-way segmentation, not for the 2-way setting; it looks more like 5-way few-shot segmentation. We are revising our arXiv manuscript to remove this entry from Table 3. Thank you for pointing it out.

Na-Z commented 4 years ago

Please see the third sentence of Section 5.1 in the PL paper: "We sample 500 S-(xq,yq) pairs that contain the "person" and another held-out class for evaluation while trained on the other three subsets." Based on my understanding, the 2-way test episodes consist of the "person" class plus another held-out class (i.e. diningtable/dog/horse/motorbike).

kaixin96 commented 4 years ago

Thanks for the update. Our strategy differs slightly from PL's in that we do not fix 'person' as a held-out class, and we average performance across the 4 splits.

ahyunSeo commented 4 years ago

Hello. My understanding is that the PL paper reported results for fold 2 (the one containing the person class), while PANet reported the average performance over every fold, just as in the 1-way experiments. I wonder how PANet samples episodes for each split, because to my knowledge fold 0 contains fewer than 10 validation images in the 2-way setting. Feel free to correct me. Thank you.

kaixin96 commented 4 years ago

Hi @ahyunSeo , taking the 2-way setting as an example, we simply sample from the images that contain objects from either of the two classes. If an image contains only one of the two classes, we expect the model to predict nothing for the other class.
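A minimal sketch of that sampling rule, assuming per-pixel integer label maps (the helper `episode_masks` and the toy 4x4 label map are hypothetical, not PANet's code): an image qualifies for the episode if it contains either class, and a class absent from the image simply gets an all-zero target mask.

```python
import numpy as np

def episode_masks(label_map, episode_classes):
    """Build one binary target mask per episode class.

    If the image contains no pixels of a class, its mask is all zeros,
    and the model is expected to predict nothing for that class.
    """
    return {c: (label_map == c).astype(np.uint8) for c in episode_classes}

# Hypothetical 4x4 label map containing only class 1 (no class-2 pixels)
label_map = np.array([[0, 1, 1, 0],
                      [0, 1, 1, 0],
                      [0, 0, 0, 0],
                      [0, 0, 0, 0]])

masks = episode_masks(label_map, episode_classes=(1, 2))
assert masks[1].sum() == 4  # class 1 is present: 4 foreground pixels
assert masks[2].sum() == 0  # class 2 is absent: empty target mask
```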

ahyunSeo commented 4 years ago

I got it. @kaixin96 Thank you for the clarification.