maudzung / TTNet-Real-time-Analysis-System-for-Table-Tennis-Pytorch

Unofficial implementation of "TTNet: Real-time temporal and spatial video analysis of table tennis" (CVPR 2020)
https://arxiv.org/pdf/2004.09927.pdf

Training the Global Model alone doesn't give valid predictions on the test sequences #15

Open tjpreddy opened 3 years ago

tjpreddy commented 3 years ago

I trained the TTNet global model using `ttnet_1st_phase.sh`. Using the trained model on a new test sequence always gives a fixed prediction for the entire video.

```
python test.py --working-dir ../ --saved_fn ttnet_1st_phas --no-val --batch_size 8 --num_workers 1 --lr 0.001 --lr_type 'step_lr' --lr_step_size 10 --lr_factor 0.1 --global_weight 5. --seg_weight 1. --no_local --no_event --no_seg --smooth-labelling --show_image
```

Output:

```
here!! number of trained parameters of the model: 7781952
loading pre-trained model
  0%| | 0/54 [00:00<?, ?it/s]
===================== batch_idx: 0 ================================
Ball Detection - Global stage: (x, y) - gt = (199, 57), prediction = (0, 72)
Ball Detection - Overall: (x, y) - org: (1196, 486), prediction = (0, 607)
```

egehancosgun commented 3 years ago

Did you solve this problem? I have the same issue.

AugustRushG commented 2 weeks ago

I don't think they implemented the code fully correctly. In the original paper, the ball detection module ends in two fully connected layers, one producing x and one producing y, representing the coordinates. In this repo's code, however, there is only one, which produces a single vector of size width + height = 448. That mismatch could be one reason why the ball detection stage in this repo does not work as well as in the original paper. A minimal sketch of the two conventions is below.
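For illustration only (not the repo's actual code; the class names and `in_features` size are made up), assuming the 320x128 global input from the paper:

```python
import torch
import torch.nn as nn

W, H = 320, 128  # global-stage resolution assumed from the paper (320 + 128 = 448)

class TwoHeadBallDetector(nn.Module):
    """Paper-style head: separate FC layers for the x- and y-distributions."""
    def __init__(self, in_features=1792):
        super().__init__()
        self.fc_x = nn.Linear(in_features, W)  # logits over the 320 x-positions
        self.fc_y = nn.Linear(in_features, H)  # logits over the 128 y-positions

    def forward(self, feat):
        return torch.sigmoid(self.fc_x(feat)), torch.sigmoid(self.fc_y(feat))

class SingleHeadBallDetector(nn.Module):
    """Repo-style head: one FC layer producing a concatenated (320 + 128) vector."""
    def __init__(self, in_features=1792):
        super().__init__()
        self.fc = nn.Linear(in_features, W + H)

    def forward(self, feat):
        out = torch.sigmoid(self.fc(feat))
        # The two distributions still have to be split before taking the argmax.
        return out[:, :W], out[:, W:]

# Either way, the predicted ball position is the argmax of each distribution.
feat = torch.randn(1, 1792)
px, py = SingleHeadBallDetector()(feat)
print(int(px.argmax(dim=1)), int(py.argmax(dim=1)))
```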

AugustRushG commented 1 week ago

Just want to add: you can try adding `--thresh_ball_pos_mask 0.00001` to the `test.sh` file, which should allow it to produce valid output.
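The idea is that the threshold suppresses low-confidence bins of the predicted ball-position distribution before the argmax; if the model's confidences are all below the default threshold, the prediction collapses to a fixed 0. A rough sketch of that mechanism (function name and details are made up, not the repo's actual implementation):

```python
import torch

def extract_ball_pos(pred_vec, thresh, w=320, h=128):
    """Illustrative only: zero out low-confidence bins before taking the argmax
    of the x- and y-distributions."""
    pred = torch.sigmoid(pred_vec)
    pred[pred < thresh] = 0.      # bins below the threshold are discarded
    x = int(pred[:w].argmax())    # argmax of an all-zero vector is 0,
    y = int(pred[w:].argmax())    # which is how predictions get stuck at 0
    return x, y

logits = torch.randn(448) - 4.   # low-confidence output, e.g. from an undertrained model
print(extract_ball_pos(logits, thresh=0.5))      # likely (0, 0): everything suppressed
print(extract_ball_pos(logits, thresh=0.00001))  # the argmax of the raw distribution survives
```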