joslefaure / HIT

Official Implementation of our WACV2023 paper: “Holistic Interaction Transformer Network for Action Detection”
https://arxiv.org/abs/2210.12686

Some questions regarding the results #35

Status: Closed. ZWXCV closed this issue 6 months ago.

ZWXCV commented 10 months ago

Thank you for the reply. Why is the video mAP I obtained so different from the value reported in your paper, by about 3% or 5%? Should I modify certain parameters to reproduce the results shown in the paper? I hope you can offer some suggestions. Thank you very much.

joslefaure commented 10 months ago

In the config, try changing the structure from hitnet to serial, and please tell me the results you get.

ZWXCV commented 10 months ago

Do you mean changing STRUCTURE: "hitnet" to STRUCTURE: "serial"? After that modification, training fails with the following error:

Traceback (most recent call last):
  File "train_net.py", line 255, in <module>
    main()
  File "train_net.py", line 245, in main
    args.no_head)
  File "train_net.py", line 100, in train
    mem_active,
  File "/17106/zwx/HIT-master/hit/engine/trainer.py", line 59, in do_train
    loss_dict, weight_dict, metric_dict, pooled_feature = model(slow_video, fast_video, boxes, objects, keypoints, mem_extras)
  File "/root/anaconda3/envs/hit/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/17106/zwx/HIT-master/hit/modeling/detector/action_detector.py", line 20, in forward
    result, detector_losses, loss_weight, detector_metrics = self.roi_heads(slow_features, fast_features, boxes, objects, keypoints, extras, part_forward)
  File "/root/anaconda3/envs/hit/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/17106/zwx/HIT-master/hit/modeling/roi_heads/roi_heads_3d.py", line 12, in forward
    result, loss_action, loss_weight, accuracy_action = self.action(slow_features, fast_features, boxes, objects, keypoints, extras, part_forward)
  File "/root/anaconda3/envs/hit/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/17106/zwx/HIT-master/hit/modeling/roi_heads/action_head/action_head.py", line 44, in forward
    x, x_pooled, x_objects, x_keypoints, x_pose = self.feature_extractor(slow_features, fast_features, proposals, objects, keypoints, extras, part_forward)
  File "/root/anaconda3/envs/hit/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/17106/zwx/HIT-master/hit/modeling/roi_heads/action_head/roi_action_feature_extractor.py", line 145, in forward
    ia_feature, res_person, res_object, res_keypoint = self.hit_structure(person_pooled, proposals, object_pooled, objects, hands_pooled, keypoints, memory_person, None, None, phase="rgb")
ValueError: too many values to unpack (expected 4)

The error points to a mismatch in the number of returned values.
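For anyone hitting the same message: Python raises `ValueError: too many values to unpack` when the right-hand side yields more values than there are names on the left. The sketch below reproduces that with two hypothetical stand-in functions. `hitnet_like` and `serial_like` are illustrative names, not code from this repository, and the extra fifth return value is an assumption about what the serial structure might do, not something confirmed in this thread.

```python
def hitnet_like():
    # Hypothetical stand-in: returns the four values the caller unpacks.
    return "ia_feature", "res_person", "res_object", "res_keypoint"


def serial_like():
    # Hypothetical stand-in: returns one extra value (assumed, for illustration only).
    return "ia_feature", "res_person", "res_object", "res_keypoint", "extra"


def unpack_four(structure):
    """Unpack exactly four values, mirroring the shape of the failing call site."""
    ia_feature, res_person, res_object, res_keypoint = structure()
    return ia_feature, res_person, res_object, res_keypoint


def try_unpack(structure):
    """Return None if unpacking succeeds, otherwise the error message."""
    try:
        unpack_four(structure)
        return None
    except ValueError as err:
        return str(err)


if __name__ == "__main__":
    print(try_unpack(hitnet_like))
    print(try_unpack(serial_like))
```

If that is what happens here, the fix belongs either in the structure's return statement or at the call site in roi_action_feature_extractor.py, depending on which side the author intends to change.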

ZWXCV commented 10 months ago

The result before modification was:

VideoAP_ 0.5: 84.80

VideoAP_ 0.2: 86.40

joslefaure commented 10 months ago

Yes, hitnet to serial as you did.

Thanks for reporting the numbers. Others have also pointed out disparities between results of different runs (even though I set a seed). I will fix the serial bug and get back to you.
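On the seed point, a minimal sketch of seeding the usual random number sources in a PyTorch training script follows. `seed_everything` is a hypothetical helper name, not a function from this repository, and NumPy and PyTorch are only seeded if they are installed. Even with all of this in place, some CUDA operations are nondeterministic, so small run-to-run differences can remain.

```python
import os
import random


def seed_everything(seed):
    """Seed common RNG sources; NumPy and PyTorch are seeded only if installed."""
    # Affects hash randomization only in subprocesses started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # no-op when CUDA is unavailable
        # Trade speed for determinism in cuDNN algorithm selection.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass


if __name__ == "__main__":
    seed_everything(0)
    first = random.random()
    seed_everything(0)
    print(first == random.random())
```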

joslefaure commented 10 months ago

By the way, what is the frame mAP result for that run?

ZWXCV commented 10 months ago

frame mAP : 81.02

ZWXCV commented 10 months ago

Hello author, I found that the weights of some parameters were not loaded during training. Will this affect the final training results and the mAP calculation? If so, what is the cause? Is the weight file I loaded incomplete? Could you give me some suggestions? Thank you very much. Here are some log messages from the training process:

2023-11-12 10:39:16,803 hit.utils.model_serialization INFO: backbone.slow.res_nl4.res_2.btnk.conv3.bn.weight loaded from backbone.slow.res_nl4.res_2.btnk.conv3.bn.weight of shape (2048,)
2023-11-12 10:39:16,803 hit.utils.model_serialization INFO: backbone.slow.res_nl4.res_2.btnk.conv3.conv.weight loaded from backbone.slow.res_nl4.res_2.btnk.conv3.conv.weight of shape (2048, 512, 1, 1, 1)
2023-11-12 10:39:16,803 hit.utils.model_serialization INFO: roi_heads.action.feature_extractor.fc1.bias will not be loaded.
2023-11-12 10:39:16,803 hit.utils.model_serialization INFO: roi_heads.action.feature_extractor.fc1.weight will not be loaded.

joslefaure commented 6 months ago

In your case, weights not being loaded might be because you are training from an existing checkpoint.
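To see which parameters a checkpoint can actually supply, one option is to compare parameter names and shapes on both sides. The sketch below is a generic check and not the logic in hit.utils.model_serialization; `report_unloaded` is a hypothetical helper, and it assumes a parameter is skipped when the checkpoint lacks its name or stores it with a different shape. The fc1 shape in the example is made up for illustration.

```python
def report_unloaded(model_shapes, checkpoint_shapes):
    """Split model parameter names into loadable, missing, and shape-mismatched."""
    loadable, missing, mismatched = [], [], []
    for name, shape in sorted(model_shapes.items()):
        if name not in checkpoint_shapes:
            missing.append(name)
        elif tuple(checkpoint_shapes[name]) != tuple(shape):
            mismatched.append(name)
        else:
            loadable.append(name)
    return loadable, missing, mismatched


if __name__ == "__main__":
    # Parameter names taken from the log above; the fc1 shape is hypothetical.
    model_shapes = {
        "backbone.slow.res_nl4.res_2.btnk.conv3.bn.weight": (2048,),
        "roi_heads.action.feature_extractor.fc1.weight": (1024, 2304),
    }
    checkpoint_shapes = {
        "backbone.slow.res_nl4.res_2.btnk.conv3.bn.weight": (2048,),
    }
    loadable, missing, mismatched = report_unloaded(model_shapes, checkpoint_shapes)
    print("missing from checkpoint:", missing)
    print("shape mismatch:", mismatched)
```

With a real PyTorch model, the two dictionaries can be built as `{k: tuple(v.shape) for k, v in state_dict.items()}` from `model.state_dict()` and from the loaded checkpoint's state dict.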

I have uploaded the pretrained model. Possibly because the dataset is very small, different runs can give different results. I also tested different checkpoints (on JHMDB the model converges very fast and then starts overfitting).
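Since the model converges quickly and then overfits, one way to act on this is to evaluate several saved checkpoints and keep the one with the best score on a held-out split. The sketch below assumes you already have a mapping from training iteration to an evaluation score. `best_checkpoint` is a hypothetical helper, and the numbers in the example are placeholders, not results from this repository.

```python
def best_checkpoint(scores):
    """Return (iteration, score) of the highest-scoring checkpoint.

    Ties are resolved in favor of the earliest iteration.
    """
    if not scores:
        raise ValueError("no checkpoints were evaluated")
    # Sort by iteration so max() returns the earliest among equal scores.
    return max(sorted(scores.items()), key=lambda item: item[1])


if __name__ == "__main__":
    # Placeholder scores keyed by training iteration (illustrative values only).
    scores = {1000: 0.70, 2000: 0.75, 3000: 0.72}
    print(best_checkpoint(scores))
```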