Closed by wanna-fly 4 years ago
Sorry for the late reply!
Please refer to the statement in Sec. 5.4, Three Streams. Each result is obtained by keeping a single stream in P. Therefore, to reproduce our reported results, you should keep the three items in `self.predictions` (they belong to C and thus shouldn't be removed for the ablation study), and modify `self.binary_discriminator` to keep `fc7_H` and `fc7_SH` for human only, `fc7_O` and `fc7_SO` for object only, and `fc_binary_1` for spatial only. Then retrain the whole model and perform inference with the correspondingly retrained model.
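As a rough illustration of the per-stream selection described above, here is a minimal sketch in plain Python. It is not the repository's code: the helper `select_binary_inputs`, the `STREAM_FEATURES` mapping, and the toy feature values are hypothetical, and only the feature names (`fc7_H`, `fc7_SH`, `fc7_O`, `fc7_SO`, `fc_binary_1`) come from the reply. The real change would be made inside `self.binary_discriminator`, followed by retraining.

```python
from typing import Dict, List

Matrix = List[List[float]]  # one row of features per human-object pair

# Which features stay in P for each ablation setting (names from the reply;
# the mapping structure itself is an illustrative assumption).
STREAM_FEATURES: Dict[str, List[str]] = {
    "human": ["fc7_H", "fc7_SH"],
    "object": ["fc7_O", "fc7_SO"],
    "spatial": ["fc_binary_1"],
}


def select_binary_inputs(features: Dict[str, Matrix], stream: str) -> Matrix:
    """Hypothetical helper: keep only one stream's features and
    concatenate them row by row along the feature axis."""
    if stream not in STREAM_FEATURES:
        raise ValueError(f"unknown stream: {stream!r}")
    kept = [features[name] for name in STREAM_FEATURES[stream]]
    # zip(*kept) pairs up the i-th row of every kept feature matrix.
    return [sum(rows, []) for rows in zip(*kept)]


# Toy features for a single human-object pair (values are made up).
demo: Dict[str, Matrix] = {
    "fc7_H": [[1.0, 2.0]],
    "fc7_SH": [[3.0]],
    "fc7_O": [[4.0, 5.0]],
    "fc7_SO": [[6.0]],
    "fc_binary_1": [[7.0, 8.0]],
}

human_only = select_binary_inputs(demo, "human")
object_only = select_binary_inputs(demo, "object")
spatial_only = select_binary_inputs(demo, "spatial")
```

Each setting corresponds to a separately retrained model; selecting a stream only at test time from a jointly trained model is a different experiment.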
I got it. The ablation study is conducted on P. Thanks for your answer. It really helps.
Hi guys, thanks for your nice code! I'm trying to check the contribution of each stream, but the result is totally different from that in your paper. Here is my method:
- `prediction_H` generated by `net.test_image_H`, repeated for all objects paired with the current human instance during the test;
- `self.predictions["cls_prob_O"]` as prediction;
- `self.predictions["cls_prob_sp"]` as prediction for sp stream.

I train the network jointly and adopt the above settings during the test. I finally got the following results: AP = 37.85 for the human stream, AP = 31.63 for the object stream, and AP = 47.19 for the sp stream. I think there must be something wrong with my method, but I can't figure out what. Would you mind sharing your strategy for the ablation study? How do you get the results of the different streams?