TheAlgorithms / Python

All Algorithms implemented in Python
https://the-algorithms.com/
MIT License
184.14k stars 44.32k forks

Implement XGBoost classification and regression algorithms #8067

Open tianyizheng02 opened 1 year ago

tianyizheng02 commented 1 year ago

Feature description

machine_learning/xgboost_classifier.py and machine_learning/xgboost_regressor.py are how-tos since they both just use an existing library for the actual ML algorithms.

My understanding is that #7106 and #7107 were merged (not without difficulty) and the author was warned not to contribute such how-tos in the future. However, I think these algorithms should still be implemented at some point if the files are to remain in the repo, so I thought I should open an issue to bring some attention to it.

If anyone wants to implement either of these two algorithms (as in not relying on an existing library for the bulk of the algorithm), feel free to just open a PR—no need to request an assignment.

rohan472000 commented 1 year ago

While implementing the XGBoost algorithm from scratch in scikit-learn is technically feasible, it would require significant time and expertise in both machine learning and software development. Given that there are already well-established libraries for XGBoost, such as the XGBoost Python package and LightGBM, it may not be practical to invest resources into implementing XGBoost in scikit-learn without relying on an existing library. However, if someone is interested in pursuing this challenge, they are welcome to do so and contribute it to scikit-learn.

tianyizheng02 commented 1 year ago

> Given that there are already well-established libraries for XGBoost, such as the XGBoost Python package and LightGBM, it may not be practical to invest resources into implementing XGBoost in scikit-learn without relying on an existing library.

@rohan472000 Implementing algorithms is the whole purpose of this repo. From the contributing guidelines:

> Algorithms in this repo should not be how-to examples for existing Python packages.

Importing an XGBoost class from sklearn would absolutely be a how-to example. If that requires significant time and expertise in ML, then so be it. After all, the algorithm implementations in this repo are meant for educational purposes, and figuring out how to implement an algorithm is itself an educational experience.
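
As a rough illustration of what "not relying on an existing library for the bulk of the algorithm" could mean, here is a minimal sketch in pure Python. It is not the code in this repo and not a full XGBoost implementation: it assumes squared-error loss, uses depth-1 trees (stumps) with exact greedy splits, and keeps only the second-order split gain and the L2-regularized leaf weight from XGBoost. Deeper trees, the split penalty, subsampling, sparsity handling, and the logistic loss needed for classification are all left out. Every function and parameter name below is hypothetical.

```python
"""Toy gradient-boosted regression stumps with XGBoost-style split scoring.

Illustrative sketch only: squared-error loss, depth-1 trees, exact greedy
splits. All names in this file are hypothetical.
"""


def fit_stump(rows, grads, hess, reg_lambda=1.0):
    """Return (feature, threshold, left_weight, right_weight) for the best split."""
    g_total, h_total = sum(grads), sum(hess)
    parent_score = g_total**2 / (h_total + reg_lambda)
    # Fallback when no split helps: one leaf weight shared by every row.
    leaf = -g_total / (h_total + reg_lambda)
    best, best_gain = (0, float("inf"), leaf, leaf), 0.0
    for feature in range(len(rows[0])):
        order = sorted(range(len(rows)), key=lambda i: rows[i][feature])
        g_left = h_left = 0.0
        for pos in range(len(order) - 1):
            i, nxt = order[pos], order[pos + 1]
            g_left += grads[i]
            h_left += hess[i]
            if rows[i][feature] == rows[nxt][feature]:
                continue  # cannot place a threshold between equal values
            g_right, h_right = g_total - g_left, h_total - h_left
            # Second-order gain: children's structure scores minus the parent's.
            gain = 0.5 * (
                g_left**2 / (h_left + reg_lambda)
                + g_right**2 / (h_right + reg_lambda)
                - parent_score
            )
            if gain > best_gain:
                best_gain = gain
                best = (
                    feature,
                    (rows[i][feature] + rows[nxt][feature]) / 2,
                    -g_left / (h_left + reg_lambda),  # regularized leaf weights
                    -g_right / (h_right + reg_lambda),
                )
    return best


def predict_stump(stump, row):
    feature, threshold, left, right = stump
    return left if row[feature] < threshold else right


def fit_boosted_stumps(rows, targets, n_rounds=50, learning_rate=0.3, reg_lambda=1.0):
    """Fit an additive model of stumps; returns (base, learning_rate, stumps)."""
    base = sum(targets) / len(targets)
    preds = [base] * len(rows)
    stumps = []
    for _ in range(n_rounds):
        # For the loss 0.5 * (pred - y)^2 the gradient is pred - y, the Hessian 1.
        grads = [p - y for p, y in zip(preds, targets)]
        hess = [1.0] * len(rows)
        stump = fit_stump(rows, grads, hess, reg_lambda)
        stumps.append(stump)
        preds = [
            p + learning_rate * predict_stump(stump, r) for p, r in zip(preds, rows)
        ]
    return base, learning_rate, stumps


def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


if __name__ == "__main__":
    X = [[0.0], [1.0], [2.0], [3.0]]
    y = [0.0, 0.0, 10.0, 10.0]
    model = fit_boosted_stumps(X, y)
    print([round(predict(model, row), 3) for row in X])
```

A real submission would still need proper trees, doctests, and a classification loss, but the sketch shows that the core loop (compute gradients, score splits, add a shrunken tree) fits in well under a hundred lines without importing any ML package.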