Peer Review (my463) - GitHub Issues

You have done a good job! But there are three points that could be improved:

When cleaning the data, I think it would be better to replace each missing ID with the latest available information for the team concerned instead of deleting the record, since 4,605 samples are not a small part of the total sample.
It would be better to analyze the correlation among the features. Even a large dataset cannot by itself prevent overfitting. Random forest is a good tree-based method, and for regression you can apply PCA to reduce dimensionality or LASSO to select features.
Predicting an outcome in {-1, 0, 1} is a classification problem. Linear regression is better suited to a continuous variable, such as the odds of the game.
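
To make the first two points concrete, here is a minimal sketch in plain Python. It is only an illustration under my own assumptions, not your pipeline: the field names (`date`, `team_id`, `player_id`), the feature names, the toy values, and the 0.9 threshold are all hypothetical. It fills a missing ID with the most recent earlier value recorded for the same team, and then lists feature pairs whose Pearson correlation is high enough to suggest redundancy.

```python
from math import sqrt


def forward_fill_by_team(rows, team_key="team_id", date_key="date", field="player_id"):
    """Fill missing `field` values with the same team's most recent earlier value.

    Rows are processed in date order. A missing value with no earlier
    record for that team stays None instead of being guessed.
    """
    last_seen = {}
    filled = []
    for row in sorted(rows, key=lambda r: r[date_key]):
        row = dict(row)  # copy so the caller's data is not modified
        team = row[team_key]
        if row[field] is None:
            row[field] = last_seen.get(team)
        else:
            last_seen[team] = row[field]
        filled.append(row)
    return filled


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(features, threshold=0.9):
    """Return (name_a, name_b, r) for every feature pair with |r| >= threshold."""
    names = list(features)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical toy records: ISO date strings sort chronologically.
matches = [
    {"date": "2015-08-01", "team_id": 1, "player_id": 101},
    {"date": "2015-08-08", "team_id": 1, "player_id": None},
    {"date": "2015-08-01", "team_id": 2, "player_id": None},
    {"date": "2015-08-15", "team_id": 2, "player_id": 202},
]
filled = forward_fill_by_team(matches)

# Hypothetical toy features: the two odds columns are nearly identical.
features = {
    "home_odds": [1.5, 2.0, 2.5, 3.0],
    "home_odds_alt": [1.6, 2.1, 2.6, 3.1],
    "goal_diff": [2, -1, 1, 0],
}
pairs = correlated_pairs(features)
```

Records that still have no value after the fill (a team whose very first match is missing the ID) would need a separate decision, and one feature from each highly correlated pair is a candidate for removal before fitting the model.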

ben1605 / Soccer-match-outcome-prediction