Join me in celebrating the San Francisco Giants! I wrote this Go program to combine my interest in data science with my love for the SF Giants. Enter any Major League Baseball team's abbreviation and the program returns that team's 2022 season batting stats from Baseball Reference.
MIT License
Include additional factors to improve the prediction accuracy #6
This pull request aims to improve the accuracy of the program, which uses linear regression to predict a team's win total from its performance statistics in the previous season. It does so by adding pitching statistics and defensive metrics to the regression.
Changes Made
Added a section to the README.md file explaining the importance of including additional factors and using more sophisticated regression techniques to improve the accuracy of the program.
Added a new function to the program that scrapes pitching statistics and defensive metrics from the team's page on Baseball Reference.
Modified the existing regression analysis function to include the new variables in the analysis.
Updated the README.md file with instructions on how to use the updated program.
Expected Outcome
Including the additional factors and using more sophisticated regression techniques should make the program's win predictions more accurate, which in turn gives analysts a better basis for decisions in sports analytics.
Additional Information
It is important to consider the trade-off between model complexity and prediction accuracy. Adding too many variables can lead to overfitting and reduced predictive power, while too few variables can result in an oversimplified model that does not capture all the relevant factors. Therefore, we carefully selected the variables to include in the model based on their statistical significance and practical relevance.
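One common way to keep an eye on this trade-off is adjusted R², which lowers the plain R² score as predictors are added without a matching gain in fit. The sketch below is an assumption for illustration, not the selection procedure used in this pull request, and `RSquared` and `AdjustedRSquared` are hypothetical names. The values in `main` are synthetic.

```go
package main

import (
	"errors"
	"fmt"
)

// RSquared returns 1 - SS_res/SS_tot for observed and predicted values.
func RSquared(actual, predicted []float64) (float64, error) {
	if len(actual) == 0 || len(actual) != len(predicted) {
		return 0, errors.New("inputs must be non-empty and the same length")
	}
	mean := 0.0
	for _, v := range actual {
		mean += v
	}
	mean /= float64(len(actual))

	var ssRes, ssTot float64
	for i := range actual {
		d := actual[i] - predicted[i]
		ssRes += d * d
		m := actual[i] - mean
		ssTot += m * m
	}
	if ssTot == 0 {
		return 0, errors.New("actual values have zero variance")
	}
	return 1 - ssRes/ssTot, nil
}

// AdjustedRSquared penalizes r2 for using p predictors on n observations:
// 1 - (1 - r2) * (n - 1) / (n - p - 1).
func AdjustedRSquared(r2 float64, n, p int) (float64, error) {
	if n-p-1 <= 0 {
		return 0, errors.New("need more observations than predictors plus one")
	}
	return 1 - (1-r2)*float64(n-1)/float64(n-p-1), nil
}

func main() {
	// Synthetic values for illustration only, not real win totals.
	actual := []float64{1, 2, 3, 4, 5}
	predicted := []float64{1.5, 2, 3, 4, 4.5}

	r2, err := RSquared(actual, predicted)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The same fit quality scores lower when it took more predictors to reach.
	for _, p := range []int{1, 3} {
		adj, err := AdjustedRSquared(r2, len(actual), p)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("predictors=%d R2=%.4f adjusted=%.4f\n", p, r2, adj)
	}
}
```

If adding pitching or defensive variables raises R² but leaves adjusted R² flat or lower, that suggests the extra variables are fitting noise rather than improving the model.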
Related Issue
This pull request fixes #5.