PySport / kloppy

kloppy: standardizing soccer tracking- and event data
https://kloppy.pysport.org
BSD 3-Clause "New" or "Revised" License

Different unit used for shot_event.result_coordinates.z dimension for Opta vs StatsBomb #293

Closed DriesDeprest closed 6 months ago

DriesDeprest commented 8 months ago

Opta: The ground is 0 and the (middle of the) crossbar is at 40.

StatsBomb: The ground is 0 and the (middle of the) crossbar is at 2.7.

I assume StatsBomb expresses it in yards; for Opta, I don't know which unit is used.
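As a quick sanity check of the yards assumption, the StatsBomb value can be converted to meters and compared with the regulation goal height of 8 ft (about 2.44 m). This is only a sketch: `yards_to_meters` is a hypothetical helper and not part of kloppy.

```python
# One international yard is exactly 0.9144 meters.
YARD_IN_METERS = 0.9144


def yards_to_meters(value_yd: float) -> float:
    """Convert a length in yards to meters (hypothetical helper, not kloppy API)."""
    return value_yd * YARD_IN_METERS


# Crossbar z value reported above for StatsBomb, assumed to be in yards.
statsbomb_crossbar_z = 2.7
statsbomb_crossbar_m = yards_to_meters(statsbomb_crossbar_z)

print(round(statsbomb_crossbar_m, 2))  # 2.47
```

The result is close to the 2.44 m goal height, which is consistent with yards, although it does not prove it.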

How should we enable users to easily obtain standardized units for the result coordinates of a shot?

JanVanHaaren commented 8 months ago

I believe the Opta coordinates don't really have a unit. According to their documentation, Opta uses values on a 0 to 100 scale for the z coordinate and distinguishes between three zones to which different scaling factors apply to obtain coordinates in meters. They assume the crossbar is 12 centimeters wide.

DriesDeprest commented 8 months ago

Thanks for sharing.

Should the z value of the result coordinates in a kloppy event dataset be in the same unit regardless of which data provider was used as input? Or should we keep the provider's values by default and convert them only when the user specifies the unit (or provider) they want?

probberechts commented 8 months ago

I implemented it like this on purpose. By default, all coordinates are in the data provider's standard coordinate system. Typically, this means that the x-, y- and z-coordinates are all in the same unit. Obviously, this breaks as soon as you apply a transformation: since the CoordinateSystem currently does not have a z-axis, the z-coordinate cannot be transformed along with x and y. It seems logical to me to add a z-axis to the coordinate system.

The alternative, that is, using a default unit for the z-coordinate, could have strange side effects. For example, having x- and y-coordinates in yards but z-coordinates in meters would be confusing.

Also, this applies both to goalmouth coordinates in event data and ball z-coordinates in tracking data.
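The idea of giving the coordinate system a z-axis could be sketched roughly as follows. Everything here (`Unit`, `Point3D`, `SimpleCoordinateSystem`, `convert`) is a hypothetical stand-in written for illustration and is not kloppy's actual `CoordinateSystem` API. It also assumes a single length unit shared by all three axes, which would not cover a provider like Opta, whose z values use a unitless 0 to 100 scale.

```python
from dataclasses import dataclass
from enum import Enum


class Unit(Enum):
    """Length units; the value is the size of one unit in meters."""
    METERS = 1.0
    YARDS = 0.9144


@dataclass(frozen=True)
class Point3D:
    x: float
    y: float
    z: float


@dataclass(frozen=True)
class SimpleCoordinateSystem:
    """Hypothetical coordinate system with one unit for x, y and z."""
    unit: Unit


def convert(
    point: Point3D,
    source: SimpleCoordinateSystem,
    target: SimpleCoordinateSystem,
) -> Point3D:
    """Rescale all three axes together so they stay in the same unit."""
    factor = source.unit.value / target.unit.value
    return Point3D(point.x * factor, point.y * factor, point.z * factor)


# Example: a point in a yard-based system converted to a meter-based one.
yards_cs = SimpleCoordinateSystem(Unit.YARDS)
meters_cs = SimpleCoordinateSystem(Unit.METERS)
shot_end = convert(Point3D(120.0, 40.0, 2.7), yards_cs, meters_cs)
```

Because z is rescaled with the same factor as x and y, a transformation can never leave the axes in mixed units, which is the side effect the default-unit alternative would risk.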