matheusccouto / palpiteiro

Fantasy soccer tips with machine learning and genetic algorithm.
MIT License

Data quality on training set #111

Closed matheusccouto closed 1 year ago

matheusccouto commented 1 year ago

The query below produced outliers and an excess of zeros, both of which should be checked:

SELECT
    datetime(timestamp) AS timestamp,
    position,
    total_points_last_5,
    offensive_points_last_5,
    defensive_points_last_5,
    total_points_repr_last_5,
    offensive_points_repr_last_5,
    defensive_points_repr_last_5,
    spi_club,
    spi_opponent,
    prob_club,
    prob_opponent,
    prob_tie,
    importance_club,
    importance_opponent,
    proj_score_club,
    proj_score_opponent,
    total_points_club_last_5,
    offensive_points_club_last_5,
    defensive_points_club_last_5,
    total_allowed_points_opponent_last_5,
    offensive_allowed_points_opponent_last_5,
    defensive_allowed_points_opponent_last_5,
    penalties_club_last_5,
    penalties_opponent_last_5,
    received_penalties_club_last_5,
    received_penalties_opponent_last_5,
    played_last_5,
    avg_odds_club,
    avg_odds_opponent,
    avg_odds_draw,
    IF(total_points <= 0.01, 0.01, total_points) AS total_points
FROM
    palpiteiro.fct_player
WHERE
    played IS TRUE
    AND played_last_5_at > 0
    AND position != 'coach'
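
A minimal sketch of how the two checks mentioned here (outliers and excess zeros) could be run on a single column of the query result. The issue does not say which columns were affected or which method was used, so everything here is an assumption: the helper names `zero_share` and `iqr_outliers` are hypothetical, Tukey's 1.5 × IQR rule is only one possible outlier definition, and the sample values are made up for illustration and are not data from `palpiteiro.fct_player`.

```python
from statistics import quantiles


def zero_share(values):
    """Return the fraction of values that are exactly zero."""
    if not values:
        return 0.0
    return sum(1 for v in values if v == 0) / len(values)


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule).

    Needs at least two values, because statistics.quantiles requires that.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up illustrative values standing in for one numeric column
# (for example total_points_last_5); NOT real data from the table.
sample = [0.0, 0.0, 1.2, 2.5, 3.1, 3.4, 4.0, 4.8, 5.5, 250.0]

print("share of zeros:", zero_share(sample))
print("outliers:", iqr_outliers(sample))
```

Running the same two functions over every selected column would give a quick per-column summary; whether a given share of zeros counts as "excess" is a judgment call that depends on the column.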
matheusccouto commented 1 year ago

Outlier problems solved. They were caused by errors in the source tables.