Open xychen9459 opened 6 years ago
@xychen9459 "Pointing without prediction" means pointing without predicting object presence or absence, so we compute the class response maps using the ground-truth labels, as in [1]. For the pointing-with-prediction task in [2], the class response maps are computed using the predicted object labels. A true positive is counted when the predicted object category is correct and the corresponding maximum point falls inside a ground-truth box.
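The difference between the two settings can be sketched in Python as follows. This is only a minimal illustration, not the implementation used for the paper: the function names (`peak_location`, `point_in_any_box`, `pointing_hits`), the dict-based inputs, the `(xmin, ymin, xmax, ymax)` box format, and the `tolerance` parameter are all assumptions made for the example, and the response maps are assumed to be already resized to image resolution.

```python
import numpy as np


def peak_location(response_map):
    """Return the (x, y) position of the maximum of a 2-D class response map."""
    y, x = np.unravel_index(np.argmax(response_map), response_map.shape)
    return int(x), int(y)


def point_in_any_box(point, boxes, tolerance=0):
    """True if the point lies inside any (xmin, ymin, xmax, ymax) box.

    `tolerance` (hypothetical parameter) widens each box by that many pixels.
    """
    x, y = point
    return any(
        xmin - tolerance <= x <= xmax + tolerance
        and ymin - tolerance <= y <= ymax + tolerance
        for xmin, ymin, xmax, ymax in boxes
    )


def pointing_hits(response_maps, gt_boxes, classes, tolerance=0):
    """Check, for each class in `classes`, whether its peak hits a ground-truth box.

    response_maps: dict class_id -> 2-D array (assumed at image resolution)
    gt_boxes:      dict class_id -> list of ground-truth boxes in the image
    classes:       ground-truth classes for "without prediction",
                   predicted classes for "with prediction"
    """
    results = {}
    for c in classes:
        boxes = gt_boxes.get(c, [])  # empty if the class is not in the image
        results[c] = bool(boxes) and point_in_any_box(
            peak_location(response_maps[c]), boxes, tolerance
        )
    return results


# Toy image: class 0 is present, class 1 is not.
maps = {0: np.zeros((10, 10)), 1: np.zeros((10, 10))}
maps[0][3, 4] = 1.0  # peak at x=4, y=3
maps[1][8, 8] = 1.0  # peak at x=8, y=8
gt = {0: [(2, 2, 6, 6)]}

without_prediction = pointing_hits(maps, gt, classes=gt.keys())
with_prediction = pointing_hits(maps, gt, classes=[0, 1])  # class 1 wrongly predicted
```

In this sketch the only difference between the two settings is which class list is passed in: the ground-truth labels for "without prediction", or the classifier's predicted labels for "with prediction", where a wrongly predicted class can never count as a true positive. How the per-class hits are aggregated into the final numbers is left out here and should follow the protocols in [1] and [2].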
Hi, @yeezhu. I have some questions on experiments in the paper:
In Section 4.2, two different methods ("pointing without prediction" and "pointing with prediction") are used to evaluate pointing localization. Both use the same criterion to select positive results (the maximum response falls inside a ground-truth box), as mentioned in [1] and [2]. However, I can't find the difference between "without prediction" and "with prediction". Could you give me a clue about the difference?
You use the method in [2] to evaluate pointing localization, but I can't find the code that computes the metric. The released code of the original paper [2] doesn't contain the metric calculation either. I have checked all the related papers, but none of them has released code for computing the metric. Could you share your implementation, as pseudocode or Python code? I want to reproduce the results in Table 3.
Thanks.
References:
[1] J. Zhang, Z. L. Lin, J. Brandt, X. Shen, and S. Sclaroff. Top-down neural attention by excitation backprop. In ECCV, pages 543–559, 2016.
[2] M. Oquab, L. Bottou, I. Laptev, and J. Sivic. Is object localization for free? - weakly-supervised learning with convolutional neural networks. In CVPR, pages 685–694, 2015.