SilvioGiancola / SoccerNetv2-DevKit

Development Kit for the SoccerNet Challenge
MIT License

About training clips with TemporallyAwarePooling #45

Closed gmberton closed 2 years ago

gmberton commented 2 years ago

Hi, I was wondering why at training time (of TemporallyAwarePooling - NetVLAD++) you don't also consider overlapping clips? Wouldn't this produce a lot more training data (~30x more)? PS: many thanks for the repo: it's wonderful to download a repo and see that it runs without errors out of the box! Also, the fact that training lasts 50 minutes and is reproducible is amazing! The ML community needs more people like you :)

SilvioGiancola commented 2 years ago

Hi @gmberton, I remember trying overlapping clips for NetVLAD++, but couldn't get better performance. Also, consider that training with 30x more data would take 30x more memory to create the clips, and 30x more time for training.

While I generally agree with the statement "the more data, the better the model", I don't think overlapping clips would actually provide meaningful novel data, only a different shuffling of the same data. Consider that NetVLAD is a set pooling method and, as such, is order invariant (over 2 sets in the case of TemporallyAwarePooling). Also, with a sliding window of 1 frame, the next window only removes 1 old frame and adds 1 new frame, so consecutive windows share most of their frame features.
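To make the two points above concrete, here is a small sketch. It uses mean pooling as a stand-in for NetVLAD (both are permutation-invariant set poolings) and random vectors as stand-in frame features; the window length of 40 frames is an arbitrary choice for illustration, not the DevKit's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-frame features: 120 frames, 512-dim each.
features = rng.standard_normal((120, 512))

def set_pool(clip):
    # Mean pooling stands in for NetVLAD here: both are set poolings,
    # so permuting the frames inside a clip leaves the output unchanged.
    return clip.mean(axis=0)

clip = features[:40]
shuffled = clip[rng.permutation(len(clip))]
assert np.allclose(set_pool(clip), set_pool(shuffled))

# Two windows offset by a single frame share 39 of their 40 frames,
# so their pooled representations are nearly identical.
w0, w1 = features[0:40], features[1:41]
p0, p1 = set_pool(w0), set_pool(w1)
cosine = p0 @ p1 / (np.linalg.norm(p0) * np.linalg.norm(p1))
print(f"cosine similarity of adjacent windows: {cosine:.3f}")
```

With real (highly correlated) frame features the similarity between adjacent windows would be even closer to 1, which is why a 1-frame stride mostly re-presents the same pooled inputs.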

With that being said, I do believe that a smarter way to generate/select the clips could improve performance. For instance, the current implementation of TemporallyAwarePooling for training does not even center the clips around the actions. Maybe one could investigate a Hard Negative Mining method to generate those clips, and maybe regress a temporal offset, similar to CALF.
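As a hypothetical illustration of the "center the clips around the actions" idea, the sampler below builds one clip per annotated action, clamped to the game boundaries. The function name, the `window=40` length, and the frame-index annotations are all assumptions for the sketch; this is not how the DevKit currently samples clips.

```python
import numpy as np

def clips_centered_on_actions(features, action_frames, window=40):
    """Hypothetical sampler: one clip per annotated action, centered on
    the action frame and clamped so it stays inside the game."""
    half = window // 2
    clips = []
    for t in action_frames:
        start = min(max(t - half, 0), len(features) - window)
        clips.append(features[start:start + window])
    return np.stack(clips)

# Toy example: 1000 frames of 512-dim features, actions near both ends.
feats = np.zeros((1000, 512))
clips = clips_centered_on_actions(feats, [5, 500, 998])
print(clips.shape)  # (3, 40, 512)
```

A hard-negative-mining variant could use the same mechanics but pick the `action_frames` list from clips the current model scores incorrectly, rather than from the ground-truth annotations.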

PS: Thank you for your kind words! :)