PySport / kloppy

kloppy: standardizing soccer tracking- and event data
https://kloppy.pysport.org
BSD 3-Clause "New" or "Revised" License

Opta deserializer incorrectly sets end coordinate of saved shots #244

Closed: probberechts closed this issue 6 months ago

probberechts commented 7 months ago

If a shot has qualifier "102" (i.e., a goal mouth y-coordinate), the Opta deserializer automatically sets the end-x coordinate of the shot to 100 and uses the value of the qualifier to set the shot's end-y coordinate.

https://github.com/PySport/kloppy/blob/096f959b22df6f45f887fc45f7ec8d5063a958b2/kloppy/infra/serializers/event/opta/deserializer.py#L505-L507

This seems incorrect if the shot was not saved on the goal line. In the Opta data stream, saved shots have two y-coordinates: one for the location where the shot was blocked, and one that (I assume) is the projection of the shot onto the goal mouth.

  <Event id="1124226663" event_id="435" type_id="15" period_id="1" min="38" sec="47" player_id="14464" team_id="174" outcome="1" x="92.6" y="57.9" timestamp="2018-08-20T21:39:10.635">
    <Q id="1201627127" qualifier_id="17" />
    <Q id="1138078681" qualifier_id="22" />
    <Q id="1370931446" qualifier_id="102" value="52.1" />  <!-- Goal mouth y-coordinate -->
    <Q id="1083530127" qualifier_id="103" value="18.4" /> <!-- Goal mouth z-coordinate -->
    <Q id="1591211723" qualifier_id="76" />
    <Q id="1133507498" qualifier_id="56" value="Center" />
    <Q id="1898374933" qualifier_id="55" value="433" />
    <Q id="1208960709" qualifier_id="29" />
    <Q id="1463101346" qualifier_id="328" />
    <Q id="1713152233" qualifier_id="154" />
    <Q id="1320398977" qualifier_id="147" value="52.5" />  <!-- Blocked y-coordinate -->
    <Q id="1946686767" qualifier_id="146" value="99.1" />  <!-- Blocked x-coordinate -->
    <Q id="2107391537" qualifier_id="233" value="289" />
    <Q id="2102498620" qualifier_id="15" /> 
  </Event>

I think the "blocked coordinates" should be used here, at least for the x-coordinate. That would be consistent with the StatsBomb parser. I am not sure what to do with the goal mouth coordinates. Do we add an extra attribute (intended_coordinates) for shots, drop the y-coordinate and use the z-coordinate as is, drop the z-coordinate, ...?
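To make the proposal concrete, here is a minimal sketch of one possible rule: prefer the blocked coordinates (qualifiers 146 and 147) as the shot's end location, fall back to x = 100 plus the goal mouth y-coordinate (qualifier 102) when they are absent, and keep the goal mouth y/z values separately. This is not kloppy's deserializer code; the helper names `qualifier_values` and `shot_end_coordinates` and the returned dictionary keys are hypothetical, and the XML is a trimmed copy of the event above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the saved shot from this issue (only the relevant qualifiers).
EVENT_XML = """
<Event id="1124226663" type_id="15" outcome="1" x="92.6" y="57.9">
  <Q id="1370931446" qualifier_id="102" value="52.1" />
  <Q id="1083530127" qualifier_id="103" value="18.4" />
  <Q id="1320398977" qualifier_id="147" value="52.5" />
  <Q id="1946686767" qualifier_id="146" value="99.1" />
</Event>
""".strip()


def qualifier_values(event):
    """Map qualifier_id -> value for all qualifiers that carry a value."""
    return {
        q.get("qualifier_id"): q.get("value")
        for q in event.iter("Q")
        if q.get("value") is not None
    }


def shot_end_coordinates(event):
    """Hypothetical rule: blocked coordinates win over the goal mouth projection."""
    q = qualifier_values(event)
    goal_mouth_y = float(q["102"]) if "102" in q else None
    goal_mouth_z = float(q["103"]) if "103" in q else None

    if "146" in q and "147" in q:
        # Shot was stopped before the goal line: use where it was blocked.
        end = (float(q["146"]), float(q["147"]))
    elif goal_mouth_y is not None:
        # No blocked location: assume the ball reached the goal line (x = 100).
        end = (100.0, goal_mouth_y)
    else:
        end = None

    return {"end": end, "goal_mouth": (goal_mouth_y, goal_mouth_z)}


result = shot_end_coordinates(ET.fromstring(EVENT_XML))
print(result)
```

Whether the goal mouth values should live in a separate attribute, as the `goal_mouth` key does here, is exactly the open question above; the sketch only shows that both pieces of information can be kept without overwriting each other.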

I am also wondering how the y- and z-coordinates are defined by StatsBomb for saved shots. Are they also projected on the goal mouth?

JanVanHaaren commented 7 months ago

> I am also wondering how the y- and z-coordinates are defined by StatsBomb for saved shots. Are they also projected on the goal mouth?

My understanding is that StatsBomb records the x-, y-, and z-coordinates of the location where the goalkeeper saved the shot.

Example 1: https://youtu.be/CIiGAh4H6Js?si=TAbFdNoTaAMxXcMG&t=56

"location": [
    99.5,
    41.6,
    0.0
]

"end_location": [
    116.4,
    40.8,
    2.0
]

Example 2: https://youtu.be/CIiGAh4H6Js?si=Ka8dL3ySS0OhXQK_&t=163

"location": [
    111.3,
    28.9,
    0.33
],

"end_location": [
    113.8,
    31.1,
    0.8
],
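A quick check of the two examples above, as a sketch: assuming StatsBomb's 120 x 80 pitch coordinates (so the goal line sits at x = 120), both `end_location` x-values fall short of the goal line, which fits the reading that they mark where the save happened and not a projection onto the goal mouth. The list layout and variable names below are illustrative only.

```python
# Assumption: StatsBomb pitch coordinates run from 0 to 120 along x,
# so the attacking goal line is at x = 120.
GOAL_LINE_X = 120.0

# The two saved shots quoted above (location and end_location only).
shots = [
    {"location": [99.5, 41.6, 0.0], "end_location": [116.4, 40.8, 2.0]},
    {"location": [111.3, 28.9, 0.33], "end_location": [113.8, 31.1, 0.8]},
]

# Distance between the recorded end location and the goal line, per shot.
gaps = [round(GOAL_LINE_X - shot["end_location"][0], 1) for shot in shots]

for shot, gap in zip(shots, gaps):
    print(f"end x = {shot['end_location'][0]}, short of the goal line by {gap}")
```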

JanVanHaaren commented 7 months ago

> In the Opta data stream, saved shots have two y-coordinates. One coordinate for the location where the shot was blocked and one coordinate which (I assume) is based on the projection of the shot on the goal mouth.

The Stats Perform documentation provides the following descriptions for qualifiers 102, 103, 146, 147, 230 and 231.

However, qualifiers 230 and 231 come with the following note.

This can be shown for live data (for example, for eventTypeIds 14 and 16). However, from June 2020 it can also be added post-match to Blocked Shot, Miss (not past the goal line) and all shot events, but only for some competitions. When added as a post-match qualifier it is only shown once the post-match analysis is complete.