how to use my own soccer video frame, aka what is the input requirement for this model?

lood339 / SCCvSD

Sports Camera Calibration via Synthetic Data

BSD 2-Clause "Simplified" License


how to use my own soccer video frame, aka what is the input requirement for this model? #5

Closed by rambleramble 4 years ago

rambleramble commented 4 years ago

Thanks very much for sharing this code, great work!

(1) If I am using my own soccer video data for testing, what preprocessing should I do? (2) When calculating the test image HoG features in "generate_test_feature_hog.py", it seems the edge_map information is pre-computed and stored in ./data/features/testset_feature.mat. Could you shed some light on how to compute these edge maps from the raw video frames in the first place? I think that's what I am missing for using my own dataset.

Thanks in advance!

lood339 commented 4 years ago

Thanks for your interest in our work.

If you use your own soccer video, there are two main assumptions. First, the camera should be located roughly at the center of the playing field. If the camera is at a corner of the soccer field, the method will fail. Second, soccer fields vary in size in practice, and the size we use is 74 x 115 yards, so the real soccer field size should be close to these values.
The edge map is generated by a GAN. Here is the link: https://github.com/lood339/pytorch-two-GAN
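
As a rough illustration of the second assumption, the sketch below converts the 74 x 115 yard template to meters and checks whether a field measured in meters is close to it. This is not code from SCCvSD: the function names and the 5% tolerance are hypothetical choices made only for this example.

```python
YARD_IN_METERS = 0.9144  # exact, by definition of the international yard

# Field size quoted in the reply above, in yards: (width, length).
MODEL_FIELD_YARDS = (74.0, 115.0)


def yards_to_meters(yards):
    """Convert a length in yards to meters."""
    return yards * YARD_IN_METERS


def field_size_close(width_m, length_m, rel_tol=0.05):
    """Return True if a field given in meters is within rel_tol of 74 x 115 yards.

    rel_tol is a hypothetical threshold, not a value given by the authors.
    """
    ref_width = yards_to_meters(MODEL_FIELD_YARDS[0])
    ref_length = yards_to_meters(MODEL_FIELD_YARDS[1])
    return (abs(width_m - ref_width) <= rel_tol * ref_width
            and abs(length_m - ref_length) <= rel_tol * ref_length)
```

For example, `field_size_close(68.0, 105.0)` is true because the template is about 67.7 m x 105.2 m, while a much smaller pitch such as 45 m x 90 m fails the check.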

Hope that helps.

rambleramble commented 4 years ago

Great, thanks very much for this prompt and clear explanation! One more question: which dataset did you use for training and testing your two-GAN model in "soccer_seg_detection"? It seems the field markings were manually annotated in the xxx_AB.jpg images.

Also, would it be possible to include a "training from scratch using own dataset" instruction in your pytorch-two-GAN repository?

Thanks again!

lood339 commented 4 years ago

I use the dataset from 'Sports Field Localization Via Deep Structured Models' (http://www.cs.toronto.edu/~namdar/). The data is pre-processed to serve as input to the network.

Yes, it is definitely possible to train from scratch. For the training set, each image needs a ground-truth homography. The program also needs the actual size of the soccer field.

The example function 'ut_generate_grassland_mask' in https://github.com/lood339/SCCvSD/blob/master/python/util/iou_util.py shows part of the data preprocessing.
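
To make the role of the ground-truth homography and the field size concrete, here is a minimal sketch of projecting the corners of the field template into an image with a 3x3 homography. It is not taken from the repository: the helper names, the field coordinate convention (origin at one corner, x along the length), and the template-to-image direction of the matrix are all assumptions made for illustration.

```python
def project_point(h, x, y):
    """Apply a 3x3 homography h (nested lists, row-major) to the point (x, y)."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to get pixel coordinates.
    return (u / w, v / w)


def field_corners(width, length):
    """Corners of a field template; the axis convention is an assumption."""
    return [(0.0, 0.0), (length, 0.0), (length, width), (0.0, width)]


def project_field_corners(h, width=74.0, length=115.0):
    """Project the four template corners (in yards by default) into the image."""
    return [project_point(h, x, y) for x, y in field_corners(width, length)]
```

With a toy homography that scales by 10 pixels per yard and shifts by (50, 20), `project_field_corners([[10.0, 0.0, 50.0], [0.0, 10.0, 20.0], [0.0, 0.0, 1.0]])` returns the image positions of the four corners, which could then be compared against annotated field markings.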

rambleramble commented 4 years ago

great, thanks so much!!

Itzikefraim commented 3 years ago

Hi Ramble,

I am trying to use my own video frames. I am using the demo script and I can't find where I can change the input. Did you have any luck changing the input?