Add expected pass model

Alek050 commented 1 month ago

A physical model that predicts the likelihood of a successful pass given the locations and velocities of all players, the initial ball velocity, and the ball's angle of movement.

jonas-bischofberger commented 1 week ago

I have started working on the implementation of the model and am currently encountering two major pain points:

Normalized coordinates (wrt attacking direction) are not included in the processed data by default, even though they must be used somewhere, such as in xG and xT. Is there a standardized way to obtain them that I'm missing? At the moment, I would use databallpy.features.add_team_possession to get the possession info and calculate the normalized coordinates from that myself.
The tabular tracking data format I obtain via match.tracking_data does not make sense to me: currently a row corresponds to an entire frame of tracking data rather than an object-position pair. This means that I don't have, and can't add, any metadata about the players (e.g. to identify which team a player belongs to), and I also can't join player identities with the event data (e.g. to exclude the passer from potential receivers). Is there a built-in way to get a different table format and to get the missing mapping information between tracking and event data?

Alek050 commented 1 week ago

Hi @jonas-bischofberger, thanks for your message and great to see that you started!

Right now there is no built-in way to normalize coordinates wrt attacking direction. The playing direction for the home team is always from left to right, and for the away team from right to left. I will open an issue to create a built-in way to get normalized coordinates wrt attacking direction.

For now, there are two scenarios. If you only need the tracking and event data at the moment of the pass, use the team_id column in the event data to check whether it is match.home_team_id or match.away_team_id. If it is the away team id, you have to multiply all _x, _vx (and _ax) columns in the tracking data by -1, as well as start_x, start_y (and end_x, end_y) in the event data. If you need normalized coordinates for all frames, not only the ones where events happen, the approach you use right now is the only solution.
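The first scenario can be sketched with plain pandas. This is only an illustration under assumptions: the column names (ball_x, home_1_x, away_7_vx, ...) follow the naming described in this thread, the two small tables stand in for one tracking frame and its matching pass event, away_team_id is a hypothetical stand-in for match.away_team_id, and flip_columns and tracking_columns_to_flip are helper names made up for this example, not part of databallpy.

```python
import pandas as pd


def tracking_columns_to_flip(df: pd.DataFrame, suffixes=("_x", "_vx", "_ax")) -> list:
    """Pick the tracking columns whose name ends with one of the suffixes."""
    return [col for col in df.columns if col.endswith(suffixes)]


def flip_columns(df: pd.DataFrame, columns: list, is_away: bool) -> pd.DataFrame:
    """Return a copy of df with the given columns multiplied by -1 for the away team."""
    out = df.copy()
    if is_away:
        out[columns] = out[columns] * -1
    return out


# Hypothetical data: one tracking frame and the pass event that happens in it.
tracking = pd.DataFrame({
    "frame": [100],
    "ball_x": [10.0], "ball_y": [5.0],
    "home_1_x": [-20.0], "home_1_vx": [1.5],
    "away_7_x": [12.0], "away_7_vx": [-2.0],
})
event = pd.DataFrame({
    "team_id": [2],
    "start_x": [10.0], "start_y": [5.0],
    "end_x": [25.0], "end_y": [-3.0],
})
away_team_id = 2  # assumption: stands in for match.away_team_id

is_away = bool(event.loc[0, "team_id"] == away_team_id)

normalized_tracking = flip_columns(tracking, tracking_columns_to_flip(tracking), is_away)
normalized_event = flip_columns(event, ["start_x", "start_y", "end_x", "end_y"], is_away)
```

The suffixes are a parameter on purpose: the defaults mirror the columns listed above, and passing additional suffixes such as "_y" and "_vy" would flip the y-direction of the tracking data as well, if that is what the model needs.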

The one-row-per-frame tracking format was a design choice at the beginning of the package. All the metadata about the players can be found in match.home_players and match.away_players. You can use match.player_id_to_column_id() to match player ids to the column id in the tracking data (which is f"{team_side}_{jersey_number}"). Also check out match.home_players_column_ids() or match.away_players_column_ids() to get a list of column ids for an entire team.
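If a table with one row per frame and object is easier to work with, the wide format can be reshaped with plain pandas. The sketch below is not a databallpy feature. It assumes tracking columns are named f"{column_id}_{quantity}" (for example home_1_x), that there is a frame column, and it uses a hypothetical players table with column_id and player_id columns as a stand-in for whatever mapping you build from match.home_players and match.away_players. tracking_to_long is a helper name made up for this example.

```python
import pandas as pd


def tracking_to_long(tracking: pd.DataFrame, id_col: str = "frame") -> pd.DataFrame:
    """Reshape wide tracking data into one row per (frame, object)."""
    long = tracking.melt(id_vars=[id_col], var_name="column", value_name="value")
    # Split e.g. "home_1_x" into the column id "home_1" and the quantity "x".
    parts = long["column"].str.rsplit("_", n=1, expand=True)
    long["column_id"] = parts[0]
    long["quantity"] = parts[1]
    # Turn the quantities (x, y, ...) back into columns.
    long = long.pivot(
        index=[id_col, "column_id"], columns="quantity", values="value"
    ).reset_index()
    long.columns.name = None
    # "home", "away" or "ball", taken from the start of the column id.
    long["team_side"] = long["column_id"].str.split("_").str[0]
    return long


# Hypothetical wide tracking data: two frames, the ball and two players.
tracking = pd.DataFrame({
    "frame": [1, 2],
    "ball_x": [0.0, 1.0], "ball_y": [0.0, 0.5],
    "home_1_x": [-20.0, -19.5], "home_1_y": [3.0, 3.1],
    "away_7_x": [15.0, 14.0], "away_7_y": [-4.0, -4.2],
})
long = tracking_to_long(tracking)

# Hypothetical mapping from column ids to player ids.
players = pd.DataFrame({"column_id": ["home_1", "away_7"], "player_id": [101, 207]})
long_with_ids = long.merge(players, on="column_id", how="left")
```

With player ids attached to each row, excluding the passer from the potential receivers becomes a filter on player_id against the player id stored in the event data. The ball rows have no match in the players table, so their player_id is missing after the left join.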

Lastly, check out match.passes_df or match.pass_events for more info. For instance, match.pass_events is a dict of PassEvents with attributes like team_side and start_x. The PassEvents should generally work, but they are still in beta, so there might be some bugs in there. On top of that, I have limited access to metrica data, so there might be some weird edge cases.

If you have any ideas/updates on how to make the package more intuitive and easier to use, please let me know so I can make some changes to the package and make it easier for anyone to use.

Alek050 / databallpy

Add expected pass model #242