PySport / kloppy

kloppy: standardizing soccer tracking- and event data
https://kloppy.pysport.org
BSD 3-Clause "New" or "Revised" License

[StatsPerform] Bugfixes for tracking data (MA25) + Support for event data (MA3) #310

Closed probberechts closed 1 month ago

probberechts commented 2 months ago

Bugfixes for deserializing Stats Perform tracking data (MA1 + MA25 feed)

First, this PR fixes a few bugs in the tracking deserializer:

Support for deserializing Stats Perform event data (MA1 + MA3 feed)

Second, it adds support for deserializing event stream data. The content of the Stats Perform MA3 feed is identical to that of the Opta F24 feed; only the format differs. To avoid duplicate code, I refactored the OptaDeserializer to first parse the data into OptaEvents, which are subsequently deserialized to kloppy objects. I also renamed the OptaDeserializer to StatsPerformDeserializer to convey that it now deserializes feeds distributed by Stats Perform (and not only the legacy Opta feeds).

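from kloppy import statsperform

# Load a match from the two Stats Perform feeds (test fixtures from the repo).
dataset = statsperform.load_event(
    ma1_data="kloppy/tests/files/statsperform_event_ma1.json",  # MA1 feed
    ma3_data="kloppy/tests/files/statsperform_event_ma3.json",  # MA3 event feed
    coordinates="opta",  # keep coordinates in the Opta coordinate system
)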

Both the XML and JSON versions of the feed are supported.
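
To illustrate the two-stage design described above, here is a minimal sketch: format-specific parsers produce one shared intermediate record, and a single conversion step turns those records into domain objects. This is not kloppy's actual code. The names RawEvent, parse_json, parse_xml and to_domain, as well as the feed layouts, are hypothetical and chosen only for illustration.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RawEvent:
    """Hypothetical format-independent intermediate record."""
    event_id: str
    type_id: int
    x: float
    y: float


def parse_json(text: str) -> List[RawEvent]:
    # Assumed JSON layout, for illustration only.
    return [
        RawEvent(str(e["id"]), int(e["typeId"]), float(e["x"]), float(e["y"]))
        for e in json.loads(text)["events"]
    ]


def parse_xml(text: str) -> List[RawEvent]:
    # Assumed XML layout, for illustration only.
    root = ET.fromstring(text)
    return [
        RawEvent(el.get("id"), int(el.get("typeId")),
                 float(el.get("x")), float(el.get("y")))
        for el in root.iter("event")
    ]


def to_domain(events: List[RawEvent]) -> List[Dict]:
    # Single shared conversion step; the type mapping here is illustrative.
    return [
        {
            "event_id": e.event_id,
            "kind": "pass" if e.type_id == 1 else "other",
            "coordinates": (e.x, e.y),
        }
        for e in events
    ]


json_feed = '{"events": [{"id": 1, "typeId": 1, "x": 50.0, "y": 50.0}]}'
xml_feed = '<feed><event id="1" typeId="1" x="50.0" y="50.0"/></feed>'

# Both formats go through the same conversion and yield the same objects.
json_result = to_domain(parse_json(json_feed))
xml_result = to_domain(parse_xml(xml_feed))
```

Because only the parsers know about the file format, supporting another serialization of the same feed would only require a new parser; the conversion logic stays in one place.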

Other changes

koenvo commented 1 month ago

@JanVanHaaren were you able to give this a review?

probberechts commented 1 month ago

I was also unsure how to name that input parameter, since it is indeed not the MA25 feed itself but the txt file that is referenced in the feed. However, I had never heard of OPT data before: the Stats Perform API docs do not seem to mention it, and I cannot find anything via Google either. From my personal experience, I assume that people can figure out what is meant by ma25_data, whereas I would not know what opt_data is (moreover, people might confuse it with "opta"). In which context is it referred to as OPT data?

JanVanHaaren commented 1 month ago

I agree that naming that input parameter is not trivial. I am fine with keeping ma25_data but I was leaning towards opt_data because the Stats Perform Data Delivery Team always uses the term OPT files in their communication with clubs. Those files can be obtained through the Stats Perform API or the Stats Perform Download Portal.

Dear all,

Please find enclosed the Fitness Report, players analysis & fitness 15min of : Club Brugge v Cercle Brugge

Following files are also available for download from https://pro-download-portal.statsperform.com/ : 

    Fitness Report.
    OPT Files.
    Advanced XML.

Best Regards,

Data Delivery Team – STATS PERFORM

probberechts commented 1 month ago

How about improving the docstring? Something like:

ma25_data: txt file linked in the MA25 Match Tracking Feed; also known as an OPT file

JanVanHaaren commented 1 month ago

> How about improving the docstring? Something like:
>
> ma25_data: txt file linked in the MA25 Match Tracking Feed; also known as an OPT file

This solution looks good to me!
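
For reference, the agreed wording could sit in a docstring roughly as follows. This is only a sketch: load_tracking_stub is a hypothetical stand-in, not kloppy's real loader or its signature, and only the ma25_data description is taken from the discussion above.

```python
from typing import Any


def load_tracking_stub(ma1_data: Any, ma25_data: Any) -> None:
    """Hypothetical stub that only demonstrates the docstring wording.

    Args:
        ma1_data: MA1 feed.
        ma25_data: txt file linked in the MA25 Match Tracking Feed;
            also known as an OPT file.
    """
    # Intentionally not implemented: this stub exists for its docstring.
    raise NotImplementedError


docstring = load_tracking_stub.__doc__
```

Mentioning "OPT file" in the description keeps the parameter name aligned with the API docs while still being searchable for people who know the files by the name used in Stats Perform's emails to clubs.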