Open probberechts opened 11 months ago
Thank you for raising the issues and starting the discussion, @probberechts. I must admit that I didn't think much about the potential implications while reviewing #242. I also completely forgot about the discussion in #135.
As I recently mentioned in #240, it would be helpful if we had formal definitions for each of the kloppy event types and qualifiers.
Thanks for your input on this, @probberechts.
Regarding 1/ the difference between Challenge & Tackle: To make sure I understand - are you saying that Opta's definition of a Challenge ("player unsuccessfully attempts to tackle an opponent as the opponent dribbles past them") does not mean an unsuccessful tackle but rather an unsuccessful attempt to even make the tackle? If so, I guess I misinterpreted the definition and agree that the current implementation is undesirable.
I think we should add a DribbledPast event, which would be identified as follows for
Regarding 2/ whether we want a Tackle qualifier or not: I forgot about the discussion in https://github.com/PySport/kloppy/issues/135. Moving forward, I don't have a strong opinion on whether we have an explicit Tackle qualifier or whether it is up to the user to recognize tackle events as Ground DuelEvents that are not LooseBall. I am thus happy to follow whatever is decided.
@koenvo do you have an opinion on this?
Yes, exactly. The Opta "Challenge" event matches StatsBomb's "Dribbled Past" event in roughly 90% of cases.
You can find some examples of "Challenge" events in BEL - POR at Euro2020 at
There are 27 Opta "Challenge" and 22 StatsBomb "Dribbled Past" events in this game.
I think we should add a DribbledPast event
I like how StatsBomb defines a duel: "Duel events describe when a defender challenges an attacker in some way".
In a "Dribbled Past"/"Challenge" event, there is always some degree of contact or pressing between the player that dribbles and the one that gets dribbled past. Hence, intuitively, it is some kind of duel and could thus be incorporated in the DuelEvent.
One option would be to define a "Dribbled Past" event as a DuelEvent with a DuelType.GROUND qualifier and a DuelResult.LOST outcome. A StatsBomb "Duel" with type "Tackle" and an Opta "Tackle" event would be mapped to a DuelEvent with DuelType.GROUND and DuelType.TACKLE qualifiers and a DuelResult.WON or DuelResult.LOST outcome (depending on whether possession is regained).
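To make the proposed mapping concrete, here is a minimal sketch. It does not use kloppy itself: the DuelType, DuelResult and Duel classes below are stand-ins, and map_provider_event together with its "dribbled_past" / "tackle" labels is a hypothetical helper, not an existing deserializer function.

```python
from dataclasses import dataclass
from enum import Enum


class DuelType(Enum):
    # Stand-in for kloppy's duel qualifier values; TACKLE is the proposed addition.
    GROUND = "GROUND"
    AERIAL = "AERIAL"
    LOOSE_BALL = "LOOSE_BALL"
    TACKLE = "TACKLE"


class DuelResult(Enum):
    # Stand-in for the duel outcomes discussed in this thread.
    WON = "WON"
    LOST = "LOST"


@dataclass(frozen=True)
class Duel:
    qualifiers: frozenset
    result: DuelResult


def map_provider_event(kind: str, possession_regained: bool = False) -> Duel:
    """Map a simplified provider event label to a duel (hypothetical helper)."""
    if kind == "dribbled_past":
        # Opta "Challenge" / StatsBomb "Dribbled Past": always lost by the defender.
        return Duel(frozenset({DuelType.GROUND}), DuelResult.LOST)
    if kind == "tackle":
        # Opta "Tackle" / StatsBomb "Duel" of type "Tackle":
        # won only if the defender's team regains possession.
        result = DuelResult.WON if possession_regained else DuelResult.LOST
        return Duel(frozenset({DuelType.GROUND, DuelType.TACKLE}), result)
    raise ValueError(f"unsupported event kind: {kind}")
```

With this mapping, a tackle and a dribbled past event are both ground duels, but only the tackle carries the TACKLE qualifier.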
I like this proposal because:
Thanks for sharing, I like your proposal.
So would you agree with the following next actions:
Add TACKLE qualifiers to our DuelEvents for the provider tackle events in the different deserializers:
Recognize the provider dribble(d) past / challenge events and mark them as DuelEvents with DuelType.GROUND qualifier and DuelResult.LOST outcome:
I only have a minor remark regarding
Wyscout: Recognize Dribble Past events as DuelEvents with DuelType.GROUND qualifier and DuelResult.LOST / DuelResult.WON outcome depending on provider event success
The Wyscout docs give the following examples of a won dribble past attempt:
- defending player dispossesses the attacker
- defending player kicks the ball out
- the attacker stays with the ball, but the defender forces him to go back
According to the current implementation of the StatsBomb deserializer, a team has to regain possession after a duel for it to be considered successful. Hence, only the first one would yield a DuelResult.WON outcome.
I am not sure what the best solution would be here. You could certainly argue that the second and third examples are—albeit to a lesser degree—also successful.
Okay, I understand. For Wyscout v3 I would then apply the logic shown in the screenshot to determine the DuelResult
The stoppedProgress and recoveredPossession can be read from Wyscout v3's raw data.
Do you agree with this approach?
@probberechts I've created an overview of how different duel events of different providers, currently, are parsed by kloppy and two suggestions on how to change that to properly capture the dribbled past events.
I've added an explicit and an implicit suggestion on how to adjust the kloppy duel type definitions to be able to support dribbled past events. In the explicit version, we would add a DribbledPast duel type to explicitly label dribbled past events. In the implicit version, a dribbled past event could be recognized as a duel event with a qualifier Ground and no qualifier Tackle or LooseBall.
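The difference between the two suggestions can be sketched as two predicates over a duel's qualifier set. This is only an illustration: the DuelType enum below is a stand-in for kloppy's own, DRIBBLED_PAST and TACKLE are the values under discussion, and both function names are hypothetical.

```python
from enum import Enum


class DuelType(Enum):
    # Stand-in enum; DRIBBLED_PAST only exists in the explicit proposal.
    GROUND = "GROUND"
    LOOSE_BALL = "LOOSE_BALL"
    TACKLE = "TACKLE"
    DRIBBLED_PAST = "DRIBBLED_PAST"


def is_dribbled_past_explicit(qualifiers: set) -> bool:
    """Explicit proposal: the event carries a dedicated qualifier."""
    return DuelType.DRIBBLED_PAST in qualifiers


def is_dribbled_past_implicit(qualifiers: set) -> bool:
    """Implicit proposal: a ground duel that is neither a tackle nor a loose ball."""
    return (
        DuelType.GROUND in qualifiers
        and DuelType.TACKLE not in qualifiers
        and DuelType.LOOSE_BALL not in qualifiers
    )
```

The implicit check depends on every deserializer consistently setting the Tackle and LooseBall qualifiers, whereas the explicit check only depends on the one dedicated qualifier.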
I've done this exercise for Opta, StatsBomb & Wyscout v3. I'm not planning to adjust the Wyscout v3 deserialization logic in the near future, but I wanted to do the thought exercise now to make sure our decision is future-proof in case we update the Wyscout v3 deserializer.
Do you like either of the two proposals? Which one do you prefer? Or am I still missing something in my suggestion?
Thanks @DriesDeprest.
I am happy with the assignment of the DuelType qualifiers and I don't have a strong preference for the explicit or implicit DuelType qualifiers.
Determining the DuelOutcome is more challenging. I guess the first question is whether we stick to the existing outcomes (WON, LOST, NEUTRAL) or whether we add additional gradations of being successful. Looking at your analysis, I think the following criteria are used for a duel to be successful or unsuccessful by the data providers:
One idea would be to work with qualifiers (i.e., a DuelOutcome qualifier) for each of these criteria. It should be possible to derive each of them from the data. Then you can derive a default (WON, LOST, NEUTRAL) outcome by combining qualifiers, and users can modify this definition if they do not agree. It will require quite a lot of work to implement this, though.
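A rough sketch of what such a qualifier-based derivation could look like is given below. Everything in it is an assumption for illustration: the three criteria are taken from the Wyscout examples and the possession rule mentioned elsewhere in this thread, the mapping of the weaker criteria to NEUTRAL is just one possible default, and none of the names exist in kloppy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class DuelResult(Enum):
    WON = "WON"
    LOST = "LOST"
    NEUTRAL = "NEUTRAL"


@dataclass(frozen=True)
class DuelCriteria:
    # Hypothetical per-event criteria, each derivable from provider data.
    possession_regained: bool = False
    ball_out_of_play: bool = False
    progress_stopped: bool = False


def default_result(criteria: DuelCriteria) -> DuelResult:
    """Assumed default: only regaining possession counts as a win."""
    if criteria.possession_regained:
        return DuelResult.WON
    if criteria.ball_out_of_play or criteria.progress_stopped:
        return DuelResult.NEUTRAL
    return DuelResult.LOST


def lenient_result(criteria: DuelCriteria) -> DuelResult:
    """Alternative user-defined rule: any positive criterion counts as a win."""
    if (
        criteria.possession_regained
        or criteria.ball_out_of_play
        or criteria.progress_stopped
    ):
        return DuelResult.WON
    return DuelResult.LOST


def resolve(
    criteria: DuelCriteria,
    rule: Callable[[DuelCriteria], DuelResult] = default_result,
) -> DuelResult:
    """Apply the default rule unless the user supplies their own."""
    return rule(criteria)
```

Users who disagree with the default would pass their own rule, for example resolve(criteria, rule=lenient_result), without the deserializers having to change.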
The alternative would be to mostly rely on the providers' definitions, as in your proposal. Trying to summarize this and stating what's still unclear:
- Duel[Aerial, LooseBall]: we use the data provider's definition to define the outcome as WON or LOST.
- Duel[Ground, LooseBall]:
  - WON if the player's team regains possession, otherwise LOST.
  - WON or LOST. TODO[I cannot find an exact definition for the outcomes]
- Duel[Ground, Tackle]:
  - WON if the player's team regains possession, otherwise LOST.
  - WON if the player's team regains possession or if the ball goes out of play, otherwise LOST. TODO["if the ball goes out of play" is not compatible with StatsBomb]
- Duel[Ground, DribbledPast]:
  - DribbledPast or Tackle?]
Thanks for reviewing and sharing your insights @probberechts.
On the explicit vs implicit DuelType qualifiers, my preference would be the explicit suggestion.
Regarding the DuelResult, I agree that your solution of adding the listed qualifiers that would allow calculating the DuelResult will result in more predictable and standardized behaviour across data providers. However, I don't feel comfortable committing to develop this logic, as it indeed seems like quite a lot of work.
Therefore, I would suggest that in the short run I refine our current implementation by also recognizing dribbled past events and for now use the providers' outcome labels to determine our result. Thus, I'll follow the logic which we'll agree upon here.
@JanVanHaaren @koenvo any thoughts on this? I'd like to start implementing this, but want to make sure you guys agree with the plan.
Although a bit late, I see a few issues wrt the recently merged PR #242 by @DriesDeprest.
First, the PR incorporates the Opta "Challenge" event in the kloppy DuelEvent. This changes the definition of a DuelEvent that we agreed upon in #135. Previously, duels corresponded to events that require an intervention. Instead, the main use of Opta's "Challenge" event is to describe the player who gets dribbled past when a dribbler takes them on. It means that the player who gets dribbled past either did nothing at all or was not able to touch the ball. Otherwise, the event would have been labeled as a "Tackle". Therefore, the definition of a DuelEvent in the Opta deserializer is no longer consistent with the definition in the StatsBomb and Wyscout deserializers.
I am not per se against adding the "Challenge" event, but then the StatsBomb and Wyscout deserializers should be adapted accordingly and there should be a distinction between (in Opta terminology) a tackle and a challenge. A tackle is an intervention, while a challenge is an opportunity to tackle. To draw a parallel, giving the same label to a tackle and a challenge would be like labeling a big chance as a shot. This also sabotages my effort to integrate kloppy and socceraction, because Challenges are not seen as actions in SPADL.
Second, the PR introduces a DuelType.Tackle qualifier, which is equivalent to DuelType.GROUND + ~DuelType.LOOSE_BALL. Adding this was previously suggested by @MKlaasman in https://github.com/PySport/kloppy/issues/135#issuecomment-1570161586. Although I don't like redundant qualifiers, I am not strongly against it, but it should be added to the StatsBomb and Wyscout parsers too and be documented to avoid confusion regarding the difference between a ground duel and a tackle.