Description

This pull request aims to improve the accuracy of the program that uses linear regression to predict a baseball team's number of wins from its performance statistics in the previous season. It does so by adding pitching statistics and defensive metrics as predictors in the regression analysis.

Changes Made

- Added a section to the README.md file explaining the importance of including additional factors and using more sophisticated regression techniques to improve the accuracy of the program.
- Added a new function to the program that scrapes pitching statistics and defensive metrics from the team's page on Baseball Reference.
- Modified the existing regression analysis function to include the new variables in the analysis.
- Updated the README.md file with instructions on how to use the updated program.
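For reviewers who want a concrete picture of a regression with the extra predictors, here is a minimal sketch of a multiple linear regression fit by ordinary least squares. It is not the code in this pull request: the function names (`fit_ols`, `predict`, `solve_linear_system`), the three predictors (runs per game, ERA, errors per game), and the data are all assumptions chosen for illustration. The rows are synthetic and generated from a known formula, so they are not real team statistics.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so each row operation updates both sides.
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_ols(features: Sequence[Sequence[float]], wins: Sequence[float]) -> List[float]:
    """Return [intercept, coef_1, ..., coef_k] from the normal equations."""
    rows = [[1.0, *f] for f in features]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, wins)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs: Sequence[float], feature_row: Sequence[float]) -> float:
    """Predicted wins for one team-season."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], feature_row))


# Synthetic rows (hypothetical): runs per game, ERA, errors per game.
features = [
    (4.2, 4.5, 0.70),
    (4.8, 3.9, 0.45),
    (5.1, 4.2, 0.60),
    (3.9, 3.6, 0.65),
    (4.5, 4.8, 0.50),
    (4.6, 3.4, 0.55),
]
# Wins generated from a made-up linear rule so the fit can be checked.
wins = [40 + 15 * rpg - 8 * era - 10 * err for rpg, era, err in features]

coefs = fit_ols(features, wins)
print("intercept and coefficients:", coefs)
print("predicted wins for first row:", predict(coefs, features[0]))
```

Because the synthetic wins follow an exact linear rule, the fitted coefficients should match that rule up to floating-point error. Real season data would be noisy, and a library such as scikit-learn or statsmodels would usually be a better choice than hand-rolled normal equations.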

Expected Outcome

With the additional factors and more sophisticated regression techniques, the program should predict a team's number of wins more accurately. This can help analysts make better predictions and inform decision-making in sports analytics.

Additional Information

It is important to consider the trade-off between model complexity and prediction accuracy. Adding too many variables can lead to overfitting and reduced predictive power, while too few variables can result in an oversimplified model that does not capture all the relevant factors. Therefore, we carefully selected the variables to include in the model based on their statistical significance and practical relevance.
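One common way to make this trade-off visible is adjusted R², which penalizes each extra predictor. The sketch below is an illustration of that general technique, not the selection procedure used in this pull request. The function name `adjusted_r2` and the sample numbers are assumptions.

```python
def adjusted_r2(r2: float, n_samples: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Hypothetical comparison: the same raw R^2 on 30 team-seasons,
# reached with 3 predictors versus 10 predictors.
small_model = adjusted_r2(0.80, n_samples=30, n_predictors=3)
large_model = adjusted_r2(0.80, n_samples=30, n_predictors=10)
print(small_model, large_model)
```

With the raw R² held equal, the model with more predictors gets the lower adjusted score, so an added variable only helps if it raises R² by more than the penalty it incurs.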

Related Issue

This pull request fixes #5.

markjacksonfishing / sf_giants_stats

Include additional factors to improve the prediction accuracy #6
