Open DanLuoNEU opened 1 year ago
For JHMDB inference:
The import here should be
from models.tuber_jhmdb import build_model
Also modify the import at
https://github.com/amazon-science/tubelet-transformer/blob/f610c97251e5539256095508570563ca2dc8c7a1/models/tuber_jhmdb.py#L20
to
from models.transformer.transformer import build_transformer
Inference needs 'JHMDB-GT.pkl'. I found the script to download it by following the directions in the dataset part.
The provided pretrained DETR model has a different query-embedding input dimension than both the built model and the pretrained JHMDB model.
To avoid this problem, modify the DETR loading part so that it follows the query_embed dimensions of the built model: https://github.com/amazon-science/tubelet-transformer/blob/f610c97251e5539256095508570563ca2dc8c7a1/utils/model_utils.py#L25
pretrained_dict.update({k: v[:query_size]})
if query_size == model.module.query_embed.weight.shape[0]: continue
if v.shape[0] < model.module.query_embed.weight.shape[0]:  # In case the pretrained model does not align
    query_embed_zeros = torch.zeros(model.module.query_embed.weight.shape)
    pretrained_dict.update({k: query_embed_zeros})
else:
    pretrained_dict.update({k: v[:model.module.query_embed.weight.shape[0]]})
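The same alignment logic can be pulled into a small standalone helper, which makes it easier to check outside the training script. This is only a sketch: `align_query_embed` is a hypothetical name, not a function in the repository, and it assumes the embedding width (second dimension) of the pretrained weight already matches the model.

```python
import torch


def align_query_embed(pretrained: torch.Tensor, target_shape: torch.Size) -> torch.Tensor:
    """Return query-embedding weights with exactly `target_shape` (hypothetical helper)."""
    n_target = target_shape[0]
    if pretrained.shape[0] >= n_target:
        # Enough pretrained queries: keep the first n_target rows.
        return pretrained[:n_target].clone()
    # Too few pretrained queries: fall back to zeros, mirroring the snippet above.
    return torch.zeros(target_shape, dtype=pretrained.dtype)


# Example shapes are made up for illustration.
truncated = align_query_embed(torch.ones(100, 256), torch.Size([15, 256]))
zero_filled = align_query_embed(torch.ones(10, 256), torch.Size([15, 256]))
print(truncated.shape, zero_filled.shape)
```

Note that the zero fallback discards the pretrained queries entirely when there are too few of them; whether that is the best choice for this model is not something I have verified.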
I got a different mAP from the one the table shows:
per_class [0.96529908 0.4870422 0.81740977 0.64671594 0.99981187 0.48678173
0.72522214 0.70157535 0.99132313 0.99332738 0.92539198 0.63780982
0.6607778 0.89695387 0.78694818 0.42965094 0.26324953 0.94429166
0.27346689 0.68134081 0.87238637 nan nan nan]
{'PascalBoxes_Precision/mAP@0.5IOU': 0.7231798302410739, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Basketball': 0.9652990848728149, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/BasketballDunk': 0.4870421987013735, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Biking': 0.8174097664543525, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/CliffDiving': 0.6467159401389935, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/CricketBowling': 0.9998118686054533, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Diving': 0.48678173366600064, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Fencing': 0.7252221388068574, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/FloorGymnastics': 0.7015753486207187, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/GolfSwing': 0.9913231289322941, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/HorseRiding': 0.9933273801597415, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/IceDancing': 0.9253919821730238, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/LongJump': 0.637809816668955, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/PoleVault': 0.6607777957457814, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/RopeClimbing': 0.8969538737505489, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/SalsaSpin': 0.7869481765834933, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/SkateBoarding': 0.42965094009542815, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Skiing': 0.26324952994810963, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Skijet': 0.9442916605769802, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/SoccerJuggling': 0.27346688938240526, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Surfing': 0.681340807090747, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/TennisSwing': 0.8723863740884812, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/TrampolineJumping': nan, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/VolleyballSpiking': nan, 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/WalkingWithDog': nan}
mAP: 0.72318
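As a sanity check on the numbers above, the reported mAP looks like the plain mean over the classes that have a defined AP, with the three nan entries skipped. The sketch below only redoes that arithmetic on the printed per-class values; that the evaluation code computes it the same way is my assumption.

```python
import math

# Per-class AP values copied from the output above (last three are nan).
per_class_ap = [
    0.96529908, 0.4870422, 0.81740977, 0.64671594, 0.99981187, 0.48678173,
    0.72522214, 0.70157535, 0.99132313, 0.99332738, 0.92539198, 0.63780982,
    0.6607778, 0.89695387, 0.78694818, 0.42965094, 0.26324953, 0.94429166,
    0.27346689, 0.68134081, 0.87238637, float("nan"), float("nan"), float("nan"),
]

# Drop classes without a defined AP, then average the rest.
valid = [ap for ap in per_class_ap if not math.isnan(ap)]
mean_ap = sum(valid) / len(valid)

print(len(valid), round(mean_ap, 5))
```

So only 21 of the 24 listed categories contribute to the final number.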
- Update on 'JHMDB-GT.pkl': no, that link only has Annotations, Frames and OF; it does not include the pickle file.
- Get the pickle file from MOC instead.
Thank you for your correction. Did you find any code for video-mAP inference? I want to reproduce the video-mAP on UCF101-24.
Thanks for taking your time to write this, helped me greatly. It's a shame that the codebase for this model is such a mess as-is.
For the version I am using,
AVA2.1 inference needs several modifications:
https://github.com/amazon-science/tubelet-transformer/blob/f610c97251e5539256095508570563ca2dc8c7a1/datasets/ava_frame.py#L135
In the function loadvideo, the images should be read from the folder named after the video:
video_frame_list = sorted(glob(video_frame_path + vid + '/*.jpg'))
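A slightly more defensive version of that line, as a sketch: `list_video_frames` is a hypothetical helper name, and the frame file names below are made up. Using `os.path.join` removes the dependence on `video_frame_path` ending with a slash, and `sorted` only gives temporal order if the frame numbers are zero-padded.

```python
import os
import tempfile
from glob import glob


def list_video_frames(video_frame_path: str, vid: str) -> list:
    """Return the sorted .jpg frame paths stored under <video_frame_path>/<vid>/."""
    # os.path.join avoids relying on a trailing slash in video_frame_path.
    return sorted(glob(os.path.join(video_frame_path, vid, "*.jpg")))


# Small self-check on a throwaway directory (the video name is made up).
with tempfile.TemporaryDirectory() as root:
    vid = "example_video"
    os.makedirs(os.path.join(root, vid))
    for name in ("image_000002.jpg", "image_000010.jpg", "image_000001.jpg"):
        open(os.path.join(root, vid, name), "w").close()
    frames = [os.path.basename(p) for p in list_video_frames(root, vid)]

print(frames)
```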
Change the path here for the annotations. https://github.com/amazon-science/tubelet-transformer/blob/f610c97251e5539256095508570563ca2dc8c7a1/evaluates/evaluate_ava.py#L36
With the fixes above, I get the number listed in the README table. There is still a TensorBoard error, "EOFError", though. To fix it, add lines after https://github.com/amazon-science/tubelet-transformer/blob/f610c97251e5539256095508570563ca2dc8c7a1/eval_tuber_ava.py#L48
AVA2.2 Inference