PySport / kloppy

kloppy: standardizing soccer tracking- and event data
https://kloppy.pysport.org
BSD 3-Clause "New" or "Revised" License

Add a provider type to dataset.metadata #36

Closed: bdagnino closed this issue 4 years ago

bdagnino commented 4 years ago

I think a good addition in the future would be to add a provider type to the metadata. That way you would know which provider the data came from and could check whether you need any provider-specific handling.

We could add it to kloppy\domain\common.py, something like:


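from dataclasses import dataclass
from enum import Enum
from typing import List


class Provider(Enum):
    METRICA = "metrica"
    TRACAB = "tracab"
    OPTA = "opta"
    STATSBOMB = "statsbomb"

    def __str__(self):
        # Print the plain value, e.g. "opta" instead of "Provider.OPTA"
        return self.value


# Team, Period, PitchDimensions, Score, Orientation and DatasetFlag are
# assumed to be already defined or imported in this module.
@dataclass
class Metadata:
    teams: List[Team]
    periods: List[Period]
    pitch_dimensions: PitchDimensions
    score: Score
    frame_rate: float
    orientation: Orientation
    flags: DatasetFlag
    provider: Provider  # proposed new field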
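
To illustrate how such an enum behaves, here is a minimal standalone sketch. It only repeats the proposed Provider class; the lookup from a string and the label variable are illustrative assumptions, not existing kloppy code.

```python
from enum import Enum


class Provider(Enum):
    METRICA = "metrica"
    TRACAB = "tracab"
    OPTA = "opta"
    STATSBOMB = "statsbomb"

    def __str__(self):
        return self.value


# Look a member up by its string value, e.g. when it comes from a config option.
provider = Provider("tracab")

# str() goes through the overridden __str__, so it yields the plain value.
label = "data from " + str(provider)
```

Because of the overridden __str__, str(provider) gives "tracab" rather than "Provider.TRACAB", which keeps messages readable.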

I'll eventually need it for a project I'm working on and could do it then, but it could also be a good first issue for someone else who wants to contribute.

koenvo commented 4 years ago

Indeed a good first issue!

FCrSTATS commented 4 years ago

Does the Metadata class refer to the tracking data or to the event data, or is it a single Metadata class that should refer to both?

If the latter, should Metadata have "tracking_provider: Provider" and "event_provider: Provider"?

bdagnino commented 4 years ago

@FCrSTATS each Dataset has a Metadata field. So, for example, if you have tracking from Tracab and events from Opta, you would have two datasets: one for tracking (dataset.metadata.provider = Tracab) and one for events (dataset.metadata.provider = Opta).
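
A rough sketch of that idea, assuming heavily simplified stand-ins for the real classes: the Metadata and Dataset below are hypothetical and carry only the fields needed for the illustration, and the proposed provider field does not exist in kloppy yet.

```python
from dataclasses import dataclass
from enum import Enum


class Provider(Enum):
    TRACAB = "tracab"
    OPTA = "opta"


@dataclass
class Metadata:
    # Simplified stand-in: the real Metadata has more fields (teams, periods, ...).
    provider: Provider


@dataclass
class Dataset:
    # Simplified stand-in for a tracking or event dataset.
    metadata: Metadata


# One dataset per data source, each carrying its own provider.
tracking_dataset = Dataset(metadata=Metadata(provider=Provider.TRACAB))
event_dataset = Dataset(metadata=Metadata(provider=Provider.OPTA))

providers = [ds.metadata.provider for ds in (tracking_dataset, event_dataset)]
```

Since each dataset owns its metadata, a single provider field is enough and no separate tracking_provider / event_provider fields are needed.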

FCrSTATS commented 4 years ago

Thanks Bruno, where is dataset.metadata.provider assigned?

bdagnino commented 4 years ago

It isn't yet, that's what this issue is about 😺

FCrSTATS commented 4 years ago

"is" == "should" :)

It seems sensible to add it to kloppy\domain\common.py as you suggested, but is there a natural point at which it would be used?

bdagnino commented 4 years ago

It might be the lack of caffeine, but I'm not sure I understand what you mean.

koenvo commented 4 years ago

Fixed in https://github.com/PySport/kloppy/pull/38