PySport / kloppy

kloppy: standardizing soccer tracking- and event data
https://kloppy.pysport.org
BSD 3-Clause "New" or "Revised" License

Add offensive & duel event types for SB, Opta, Wyscout #240

Closed DriesDeprest closed 5 months ago

JanVanHaaren commented 7 months ago

Thank you for this contribution, @DriesDeprest.

How are "offensive" and "defensive" duels defined exactly? Do Wyscout and Stats Perform use the same definitions?

DriesDeprest commented 7 months ago

Here are the definitions found in the documentation:

Opta: [image]

Wyscout:
https://dataglossary.wyscout.com/offensive_duel/
https://dataglossary.wyscout.com/defensive_duel/

StatsBomb: does not mark duels as offensive or defensive, so we use our own custom logic.

JanVanHaaren commented 7 months ago

Thanks, @DriesDeprest. My question about the definitions for "offensive" and "defensive" duels was referring to kloppy. Your pull request proposes to add two additional members to the DuelType class, but it doesn't provide definitions.

I had actually looked up the "official definitions" from Stats Perform and Wyscout myself too, but it was not clear from the documentation that their definitions are the same given that Stats Perform's definitions are extremely brief.

DriesDeprest commented 7 months ago

Apologies, now I understand what you meant. Thinking it through, I came up with these 2 proposals regarding the definitions:

  1. Each kloppy duel must have one and only one of these 3 qualifiers:
    • Offensive: a duel performed by a player in possession of the ball
    • Defensive: a duel performed by a player not in possession of the ball
    • Loose ball: a duel performed when no clear team is in possession

The logic used in our deserializers would then be:

If duel has a loose ball qualifier -> Loose ball duel
Elif duel has offensive qualifier from provider -> Offensive duel
Elif duel has defensive qualifier from provider -> Defensive duel
Elif team id of player performing duel == team id of ball owning team -> Offensive duel
Elif team id of player performing duel != team id of ball owning team -> Defensive duel
Else -> Error
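A minimal Python sketch of this cascade, for illustration only. `DuelKind`, `classify_duel`, and the string tags are hypothetical names and not part of kloppy's API. Since the `==` and `!=` branches together cover every pair of team ids, the sketch assumes the final "Error" branch is meant for the case where a team id is missing.

```python
from enum import Enum
from typing import Optional, Set


class DuelKind(Enum):
    """Hypothetical stand-in for the three mutually exclusive qualifiers."""

    OFFENSIVE = "OFFENSIVE"
    DEFENSIVE = "DEFENSIVE"
    LOOSE_BALL = "LOOSE_BALL"


def classify_duel(
    provider_tags: Set[str],
    player_team_id: Optional[str],
    ball_owning_team_id: Optional[str],
) -> DuelKind:
    """Return exactly one DuelKind, preferring provider tags over derived state."""
    # Provider tags take precedence, in the order given in the proposal.
    if "loose_ball" in provider_tags:
        return DuelKind.LOOSE_BALL
    if "offensive" in provider_tags:
        return DuelKind.OFFENSIVE
    if "defensive" in provider_tags:
        return DuelKind.DEFENSIVE

    # Assumption: the "Else -> Error" branch covers missing team information.
    if player_team_id is None or ball_owning_team_id is None:
        raise ValueError("cannot classify duel: team information is missing")

    # Fall back to comparing the duelling player's team with the ball-owning team.
    if player_team_id == ball_owning_team_id:
        return DuelKind.OFFENSIVE
    return DuelKind.DEFENSIVE
```

With this ordering, a duel tagged both "loose ball" and "offensive" by the provider ends up as a loose ball duel, which is what makes the three qualifiers mutually exclusive.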

  2. Each kloppy duel must have one and only one of these 2 qualifiers:
    • Offensive: a duel performed by a player in possession of the ball
    • Defensive: a duel performed by a player not in possession of the ball

The "Loose ball" qualifier is evaluated separately from the offensive / defensive qualifier.

The logic used in our deserializers would then be:

If duel has a loose ball qualifier -> Loose ball duel

If duel has offensive qualifier from provider -> Offensive duel
Elif duel has defensive qualifier from provider -> Defensive duel
Elif team id of player performing duel == team id of ball owning team -> Offensive duel
Elif team id of player performing duel != team id of ball owning team -> Defensive duel
Else -> Error
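For comparison, a rough Python sketch of this second variant, where the loose ball flag is evaluated independently of the offensive / defensive side. `DuelSide`, `classify_duel_side`, and the string tags are hypothetical names for illustration, not kloppy's API, and the error case is again assumed to mean missing team information.

```python
from enum import Enum
from typing import Optional, Set, Tuple


class DuelSide(Enum):
    """Hypothetical stand-in for the two mutually exclusive side qualifiers."""

    OFFENSIVE = "OFFENSIVE"
    DEFENSIVE = "DEFENSIVE"


def classify_duel_side(
    provider_tags: Set[str],
    player_team_id: Optional[str],
    ball_owning_team_id: Optional[str],
) -> Tuple[DuelSide, bool]:
    """Return (side, is_loose_ball); the two values are decided independently."""
    # The loose ball check does not short-circuit the side classification.
    is_loose_ball = "loose_ball" in provider_tags

    if "offensive" in provider_tags:
        side = DuelSide.OFFENSIVE
    elif "defensive" in provider_tags:
        side = DuelSide.DEFENSIVE
    elif player_team_id is None or ball_owning_team_id is None:
        # Assumption: the "Else -> Error" branch covers missing team information.
        raise ValueError("cannot classify duel: team information is missing")
    elif player_team_id == ball_owning_team_id:
        side = DuelSide.OFFENSIVE
    else:
        side = DuelSide.DEFENSIVE

    return side, is_loose_ball
```

The practical difference from the first proposal is that a loose ball duel here still carries an offensive or defensive side, so it needs a ball-owning team (or a provider tag) even when no team is clearly in possession.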

What do you think? I'm most in favor of the first proposal.

JanVanHaaren commented 7 months ago

Thank you for the proposal, @DriesDeprest.

I have the following two thoughts at the moment.

  1. What would be the advantage of still relying on tags or qualifiers from the data providers? Would the behavior not be more consistent and predictable if we applied our own logic?
  2. To me, performing a duel in possession or out of possession feels like a different concept than, for instance, performing a duel in the air or on the ground. Do we need to introduce a new concept to represent this information?

@koenvo @probberechts What are your thoughts on this proposal?

koenvo commented 7 months ago

The main objective of kloppy is to provide standardised output, independent of the vendor. This is something we should keep in mind.

The qualifiers make it pretty hard to make certain attributes available for non-Opta data. As I mentioned in a different issue, I'm in favour of deriving attributes/state where possible, as it will result in consistent and predictable behaviour for every vendor.

The implementation for StatsBomb seems pretty nice. This logic doesn't need to be implemented in the serializer but can be added as a property to the DuelEvent. This will also result in this logic being available for all future serializers.
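A rough sketch of what such a derived property could look like. The `DuelEvent` below is a simplified stand-in with hypothetical fields (`team_id`, `ball_owning_team_id`) and a hypothetical `possession_side` property; it is not kloppy's actual `DuelEvent` class.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PossessionSide(Enum):
    """Hypothetical enum; not an existing kloppy type."""

    OFFENSIVE = "OFFENSIVE"
    DEFENSIVE = "DEFENSIVE"


@dataclass
class DuelEvent:
    """Simplified stand-in for an event carrying team and possession state."""

    team_id: str
    ball_owning_team_id: Optional[str]

    @property
    def possession_side(self) -> Optional[PossessionSide]:
        """Derive the side from event state instead of vendor qualifiers."""
        # Without a ball-owning team there is nothing to derive from.
        if self.ball_owning_team_id is None:
            return None
        if self.team_id == self.ball_owning_team_id:
            return PossessionSide.OFFENSIVE
        return PossessionSide.DEFENSIVE
```

Because the property only reads state that every deserializer would already populate, each vendor gets the same behaviour without per-vendor mapping code.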

Agree on point 2) in Jan's post.