Open DriesDeprest opened 10 months ago
A few thoughts:
`DataRecord` class since it is also possible to compute metrics for tracking data frames (e.g., pitch control).

```python
from abc import ABC
from dataclasses import dataclass
from typing import Dict, Optional

import numpy as np


# kw_only=True (Python 3.10+) lets subclasses add required fields
# after the optional `provider` field declared on the base class.
@dataclass(kw_only=True)
class Metric(ABC):
    name: str
    provider: Optional["Provider"] = None


@dataclass(kw_only=True)
class ScalarMetric(Metric):
    value: float


@dataclass(kw_only=True)
class PlayerMetric(Metric):
    value: Dict["Player", float]


@dataclass(kw_only=True)
class SurfaceMetric(Metric):
    value: np.ndarray

    def value_at(self, loc: "Point") -> float:
        # Assumes loc.x and loc.y are integer (column, row) grid indices.
        return self.value[loc.y, loc.x]
```
Then, you can define classes for the most common metrics as:

```python
# kw_only requires Python 3.10+.
@dataclass(kw_only=True)
class ExpectedGoals(ScalarMetric):
    """Expected goals"""

    name: str = "xG"


@dataclass(kw_only=True)
class PostShotExpectedGoals(ScalarMetric):
    """Post-shot expected goals"""

    name: str = "PsXG"


# Subclasses Metric rather than ScalarMetric: `value` is derived here, and a
# read-only property cannot stand in for a field that __init__ assigns.
@dataclass(kw_only=True)
class GameStateValue(Metric):
    """Game state value"""

    gsv_scoring_before: Optional[float] = None
    gsv_scoring_after: Optional[float] = None
    gsv_conceding_before: Optional[float] = None
    gsv_conceding_after: Optional[float] = None

    @property
    def gsv_scoring_net(self) -> Optional[float]:
        if None in (self.gsv_scoring_before, self.gsv_scoring_after):
            return None
        return self.gsv_scoring_after - self.gsv_scoring_before

    @property
    def gsv_conceding_net(self) -> Optional[float]:
        if None in (self.gsv_conceding_before, self.gsv_conceding_after):
            return None
        return self.gsv_conceding_after - self.gsv_conceding_before

    @property
    def value(self) -> Optional[float]:
        if None in (self.gsv_scoring_net, self.gsv_conceding_net):
            return None
        return self.gsv_scoring_net - self.gsv_conceding_net
```
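As a rough usage sketch of the game state value idea above: the block uses a reduced stand-in class (only the four inputs and the derived `value`, no `name` or `provider`) so that it runs on its own, and all numbers are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GameStateValue:
    """Stand-in keeping only the four inputs and the derived value."""

    gsv_scoring_before: Optional[float] = None
    gsv_scoring_after: Optional[float] = None
    gsv_conceding_before: Optional[float] = None
    gsv_conceding_after: Optional[float] = None

    @property
    def value(self) -> Optional[float]:
        parts = (
            self.gsv_scoring_before,
            self.gsv_scoring_after,
            self.gsv_conceding_before,
            self.gsv_conceding_after,
        )
        if None in parts:
            return None  # incomplete input: report no value rather than a guess
        scoring_net = self.gsv_scoring_after - self.gsv_scoring_before
        conceding_net = self.gsv_conceding_after - self.gsv_conceding_before
        return scoring_net - conceding_net


# An action that raises the scoring value and lowers the conceding value.
good_pass = GameStateValue(
    gsv_scoring_before=0.25,
    gsv_scoring_after=0.5,
    gsv_conceding_before=0.125,
    gsv_conceding_after=0.0625,
)
print(good_pass.value)  # 0.3125

# A record where only the scoring side was supplied.
partial = GameStateValue(gsv_scoring_before=0.25, gsv_scoring_after=0.5)
print(partial.value)  # None
```

Returning `None` for incomplete input keeps a missing value distinguishable from a genuine value of zero.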
Good point 😅. I agree that adding a list of `Statistic`s is probably a better way to keep it clean and still have a lot of flexibility in adding statistics.
Agree.
Makes sense!
Fine to use "statistic" in this terminology.
Below is an updated version of how the `DataRecord` class would change, based on your input:
```python
from abc import ABC
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataRecord(ABC):
    """
    DataRecord

    Attributes:
        dataset: Reference to the dataset this record belongs to.
        prev_record: Reference to the previous DataRecord.
        next_record: Reference to the next DataRecord.
        period: See [`Period`][kloppy.domain.models.common.Period]
        timestamp: Timestamp of occurrence.
        ball_owning_team: See [`Team`][kloppy.domain.models.common.Team]
        ball_state: See [`BallState`][kloppy.domain.models.common.BallState]
        statistics: List of Statistics associated with this record.
    """

    dataset: "Dataset" = field(init=False)
    prev_record: Optional["DataRecord"] = field(init=False)
    next_record: Optional["DataRecord"] = field(init=False)
    period: "Period"
    timestamp: float
    ball_owning_team: Optional["Team"]
    ball_state: Optional["BallState"]
    # Quoted like the other forward references so the class body can be evaluated.
    statistics: List["Statistic"] = field(default_factory=list)
```
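To illustrate how such a `statistics` list might be consumed, here is a minimal sketch. `Statistic`, `ExpectedGoals`, `Record`, and `get_statistic` are simplified, hypothetical stand-ins written for this example; they are not kloppy's API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Type, TypeVar


@dataclass
class Statistic:
    """Hypothetical base class: a value with a display name."""

    value: float
    name: str


@dataclass
class ExpectedGoals(Statistic):
    # Overriding only the default keeps the field order (value, name).
    name: str = "xG"


S = TypeVar("S", bound=Statistic)


@dataclass
class Record:
    """Stand-in for DataRecord, reduced to the fields used here."""

    timestamp: float
    statistics: List[Statistic] = field(default_factory=list)

    def get_statistic(self, statistic_type: Type[S]) -> Optional[S]:
        """Return the first statistic of the given type, or None."""
        for statistic in self.statistics:
            if isinstance(statistic, statistic_type):
                return statistic
        return None


shot = Record(timestamp=12.4)
shot.statistics.append(ExpectedGoals(value=0.25))

xg = shot.get_statistic(ExpectedGoals)
print(xg.value if xg else None)  # 0.25

# Records without the statistic simply return None.
print(Record(timestamp=13.0).get_statistic(ExpectedGoals))  # None
```

Using `default_factory=list` gives every record its own list; dataclasses reject a bare `= []` default precisely because it would be shared between instances.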
I'll probably be working on implementing this in the near future and adding a parser for StatsBomb.
Expected goals
By adding an optional `xg` attribute to our `ShotEvent` class, we can support the widely used expected goals property in kloppy. This property could be populated from the raw input data during deserialization (e.g. StatsBomb) or calculated at a later stage by the user with an xG model of choice.
Proposed implementation:
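The proposed snippet is not included in this text. Purely as an illustration of the idea (this is neither the author's implementation nor kloppy's actual `ShotEvent`), a stripped-down stand-in with an optional `xg` field might look like the sketch below. The `distance_to_goal` field, `fill_xg`, and `toy_xg_model` are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ShotEvent:
    """Stand-in event with only the fields needed for the example."""

    timestamp: float
    distance_to_goal: float  # metres; hypothetical feature
    xg: Optional[float] = None  # None until a provider or model supplies it


def fill_xg(shot: ShotEvent, model: Callable[[ShotEvent], float]) -> ShotEvent:
    """Compute xG with a user-supplied model, keeping provider values intact."""
    if shot.xg is None:
        shot.xg = model(shot)
    return shot


def toy_xg_model(shot: ShotEvent) -> float:
    # Made-up rule for demonstration only, not a real xG model.
    return 0.5 if shot.distance_to_goal < 10 else 0.125


# Value already set during deserialization (e.g. taken from the raw data).
provider_shot = ShotEvent(timestamp=65.0, distance_to_goal=8.0, xg=0.75)
# No value in the raw data; computed later by the user.
raw_shot = ShotEvent(timestamp=80.0, distance_to_goal=25.0)

print(fill_xg(provider_shot, toy_xg_model).xg)  # 0.75
print(fill_xg(raw_shot, toy_xg_model).xg)  # 0.125
```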
Game state values
By adding optional `gs_scoring_before`, `gs_scoring_after`, `gs_conceding_before` and `gs_conceding_after` attributes to our `Event` class, we can support the widely used game-state-based value models in kloppy. These properties could be populated from the raw input data during deserialization (e.g. StatsBomb's on-the-ball value models) or calculated at a later stage by the user with a game state value model of choice (e.g. VAEP).
Proposed implementation:
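The proposed snippet is not included in this text. As an illustration only (neither the author's implementation nor kloppy's actual `Event`), the sketch below attaches the four optional attributes to a stand-in event and fills them from a user-supplied model. `x_before`, `x_after`, `annotate`, and `toy_state_model` are invented for the example; a real model such as VAEP would use far richer features.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Event:
    """Stand-in event: ball position before/after plus the four optional values."""

    x_before: float  # 0.0 = own goal line, 1.0 = opponent goal line (assumed)
    x_after: float
    gs_scoring_before: Optional[float] = None
    gs_scoring_after: Optional[float] = None
    gs_conceding_before: Optional[float] = None
    gs_conceding_after: Optional[float] = None


# A model maps a game state (here just x) to (scoring value, conceding value).
StateModel = Callable[[float], Tuple[float, float]]


def toy_state_model(x: float) -> Tuple[float, float]:
    # Made-up rule: scoring value grows towards the opponent goal,
    # conceding value shrinks.
    return 0.5 * x, 0.5 * (1 - x)


def annotate(event: Event, model: StateModel) -> Event:
    """Evaluate the model on the states before and after the action."""
    event.gs_scoring_before, event.gs_conceding_before = model(event.x_before)
    event.gs_scoring_after, event.gs_conceding_after = model(event.x_after)
    return event


forward_pass = annotate(Event(x_before=0.25, x_after=0.75), toy_state_model)
print(forward_pass.gs_scoring_before, forward_pass.gs_scoring_after)  # 0.125 0.375
print(forward_pass.gs_conceding_before, forward_pass.gs_conceding_after)  # 0.375 0.125
```

Events that have not been annotated keep all four attributes at `None`, which is what makes them optional for providers that do not ship such values.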
Any feedback is highly welcome!