Scraping tennis match video

simi2525 commented 5 years ago

Since we have been unable to find any publicly available datasets which contain videos of the actual matches taking place, I propose we start collecting said data from current and future tournaments using live broadcasts.

Some considerations

Since storage space might be an issue, we will have to settle on some compromises.

We can consider tracking only a small number of players so that we only have to store a smaller dataset of matches.

Another compromise concerns video quality. We should work out the quality we actually require by looking at work done by others in similar or related fields, and use that to estimate the image resolution and FPS we need.
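To make the storage trade-off concrete, here is a rough back-of-the-envelope sketch. The quality tiers and their bitrates are illustrative assumptions, not measurements of any real broadcast, and `video_size_gb` and `season_budget_gb` are hypothetical helper names.

```python
def video_size_gb(bitrate_mbps: float, hours: float) -> float:
    """Approximate size in gigabytes (10^9 bytes) of a constant-bitrate stream."""
    bits = bitrate_mbps * 1_000_000 * hours * 3600
    return bits / 8 / 1e9


def season_budget_gb(matches: int, hours_per_match: float, bitrate_mbps: float) -> float:
    """Total storage for a number of matches recorded at the same bitrate."""
    return matches * video_size_gb(bitrate_mbps, hours_per_match)


# Hypothetical quality tiers (Mbit/s); real streams will differ.
TIERS = {"480p": 1.5, "720p": 3.0, "1080p": 6.0}

if __name__ == "__main__":
    for name, mbps in TIERS.items():
        total = season_budget_gb(matches=100, hours_per_match=2.0, bitrate_mbps=mbps)
        print(f"{name}: about {total:.0f} GB for 100 two-hour matches")
```

Under these assumed numbers, halving the bitrate or the number of tracked players halves the storage, so both knobs can be tuned against whatever budget we end up with.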

hhroberthdaniel commented 5 years ago

If we use pose estimation, we can store only the movement of the skeleton, which should be much smaller.
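As a rough sanity check on the size of skeleton data, here is a minimal sketch. It assumes 17 keypoints per player (the COCO keypoint convention), three float32 values per keypoint (x, y, confidence), two players and 25 FPS; all of these are assumptions, and `pose_size_mb` is a hypothetical helper.

```python
def pose_size_mb(
    hours: float,
    fps: int = 25,                 # assumed frame rate
    players: int = 2,              # singles match
    keypoints: int = 17,           # COCO-style skeleton (assumption)
    values_per_keypoint: int = 3,  # x, y, confidence
    bytes_per_value: int = 4,      # float32
) -> float:
    """Uncompressed size in megabytes (10^6 bytes) of per-frame keypoints."""
    frames = hours * 3600 * fps
    bytes_per_frame = players * keypoints * values_per_keypoint * bytes_per_value
    return frames * bytes_per_frame / 1e6


if __name__ == "__main__":
    print(f"Two-hour match: about {pose_size_mb(2.0):.1f} MB of keypoints")
```

With these assumptions a two-hour match comes to about 73 MB uncompressed, and that is before any compression or dropping of frames without play.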

simi2525 commented 5 years ago

> If we use pose estimation, we can store only the movement of the skeleton, which should be much smaller.

I agree that we don't have to keep a backlog of all videos, and we can probably calculate poses in real time, but we would have a challenging problem on our hands.

Normal broadcasts contain lots of irrelevant shots, such as the crowd, the referee, sponsors or close-ups of the players' faces. There are also always pauses between actual play. We would either have to build a solution that is robust to such "noise" in the data, or be able to automatically flag when the feed is showing play rather than irrelevant scenes. It would probably also have to attribute each pose to the correct player automatically.

I'm not saying it's impossible, just that it will probably be hard to implement. I was thinking about storing the videos because we could manually annotate them and select only the relevant sequences. After that, we could calculate pose estimations and get rid of the video.
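To illustrate what automatic flagging and attribution might look like, here is a deliberately naive sketch. It assumes the usual behind-the-baseline broadcast camera, so that one player appears in the upper half of the frame and the other in the lower half, and it assumes a pose estimator has already produced normalised keypoints restricted to the court area. The `Pose` type and `assign_players` function are hypothetical and not taken from any library.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# (x, y, confidence), with x and y normalised to [0, 1] and y growing downward.
Keypoint = Tuple[float, float, float]


@dataclass
class Pose:
    keypoints: List[Keypoint]

    def mean_confidence(self) -> float:
        if not self.keypoints:
            return 0.0
        return sum(k[2] for k in self.keypoints) / len(self.keypoints)

    def centre_y(self) -> float:
        if not self.keypoints:
            return 0.0
        return sum(k[1] for k in self.keypoints) / len(self.keypoints)


def assign_players(
    poses: List[Pose], min_confidence: float = 0.5
) -> Optional[Dict[str, Pose]]:
    """Return {'far': ..., 'near': ...} for a frame that looks like play, else None."""
    confident = [p for p in poses if p.mean_confidence() >= min_confidence]
    # Crude play detector: exactly two confident skeletons in the frame.
    if len(confident) != 2:
        return None
    far, near = sorted(confident, key=lambda p: p.centre_y())
    # Require one skeleton in each half of the image (assumed camera angle).
    if not (far.centre_y() < 0.5 <= near.centre_y()):
        return None
    return {"far": far, "near": near}
```

Frames for which `assign_players` returns `None` would be treated as non-play and skipped. Crowd shots, replays from other angles, and ball kids or line judges in view would all break this heuristic, which is part of why manual annotation of stored video may still be needed at first.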

hhroberthdaniel commented 5 years ago

I would still focus initially on whether we can get better accuracy than bet365 by using embeddings.

On Fri, Jun 28, 2019, 4:03 PM Cristian Simionescu wrote:

> > If we use pose estimation, we can store only the movement of the skeleton, which should be much smaller.
>
> I agree that we don't have to keep a backlog of all videos, and we can probably calculate poses in real time, but we would have a challenging problem on our hands.
>
> Normal broadcasts contain lots of irrelevant shots, such as the crowd, the referee, sponsors or close-ups of the players' faces. There are also always pauses between actual play. We would either have to build a solution that is robust to such "noise" in the data, or be able to automatically flag when the feed is showing play rather than irrelevant scenes. It would probably also have to attribute each pose to the correct player automatically.
>
> I'm not saying it's impossible, just that it will probably be hard to implement. I was thinking about storing the videos because we could manually annotate them and select only the relevant sequences. After that, we could calculate pose estimations and get rid of the video.


simi2525 commented 5 years ago

I agree. However, if we intend to go in this direction in the future, I propose we start collecting recordings as soon as possible.

Tensor-Reloaded / Tennis-Betting
