VincentGranville / Main

Main folder. Material related to my books on synthetic data and generative AI. Also contains documents blending components from several folders, or covering topics spanning multiple folders.
https://mltechniques.com/product/ebook-synthetic-data/

XGBoost alternative #1

Closed Sandy4321 closed 1 year ago

Sandy4321 commented 1 year ago

I see "The author introduces a simple alternative to XGBoost" at https://mltechniques.com/product/ebook-synthetic-data/

Could you clarify what you mean?

Is it section 2.2.1, "How hidden decision trees (HDT) work", from https://github.com/VincentGranville/Main/blob/main/MLbook4-extract.pdf?

VincentGranville commented 1 year ago

Hi Sandy,

See the article with the details at https://mltechniques.com/2022/09/11/advanced-machine-learning-with-basic-excel/. To access the PDF document, the password is MLT12289058.

Best,
Vincent

--
Vincent Granville, Ph.D.
Author and Publisher
MLTechniques.com


Sandy4321 commented 1 year ago

Great, thanks for the PDFs from https://mltechniques.com/resources/

By the way, how are they different from your new books at https://mltechniques.com/shop/

especially from the eBook: Intuitive Machine Learning and Explainable AI?

VincentGranville commented 1 year ago

Hi Sandy,

The most recent and updated version is in the books "Synthetic Data" and "ML and Explainable AI". The books have numerous internal clickable cross-references to other related sections and chapters (via the index, glossary, and other navigation mechanisms). A book may contain more material than a specific PDF that you obtained for free, and it also contains material not posted anywhere else. Typos are fixed as soon as they are detected, and the text is updated based on feedback from the community.

Finally, I offer a refund if you don't find the value that you are looking for in the book. And you get any major revision for free if you are on the mailing list or request the latest version, say in a year. You also get additional explanations directly from me if you have specific questions about some of the content, once you have purchased the book.

Best,
Vincent

--
Vincent Granville, Ph.D.
Author and Publisher
MLTechniques.com


Sandy4321 commented 1 year ago

That's a generous offer.

But I still want to make one thing clear: do you have material in your book showing that hidden decision trees (HDT) perform significantly better than XGBoost on some valuable data set?

By "valuable data set" I mean data often used in industry, so that the result generalizes to many other practical data sets; not some isolated, specially tuned case where XGBoost does badly and HDT does better.

Also, is XGBoost well trained in the comparison, not just run with some initial hyperparameters?

VincentGranville commented 1 year ago

Hi Sandy,

I did not benchmark my algorithm, as the goal was to produce a simpler algorithm that is easy to implement and was useful for my NLP application. I don't expect it to outperform XGBoost, though it has features that could lead to more robustness / less overfitting. As you wrote, both XGBoost and HDT have many parameters that can be fine-tuned to show one is "better" than the other, but looking for specific parameters or data to make a point defeats the idea that it should be a fully automated procedure; it amounts to cherry-picking.

Cheers,
Vincent

--
Vincent Granville, Ph.D.
Author and Publisher
MLTechniques.com
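The cherry-picking concern above can be made concrete with a minimal, tuning-free comparison harness: fix the data, use library-default hyperparameters for every model, and score with cross-validation, so no model is tuned to flatter the comparison. Note the assumptions: HDT has no public package, so a plain decision tree stands in for it here, and scikit-learn's GradientBoostingClassifier stands in for XGBoost; both substitutions are for illustration only.

```python
# Sketch of a cherry-picking-resistant benchmark: same fixed dataset,
# default hyperparameters, 5-fold cross-validation for every model.
# GradientBoostingClassifier is a stand-in for XGBoost, and a single
# DecisionTreeClassifier is a stand-in for HDT (assumptions, not the
# author's actual method).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Fixed synthetic data with a pinned random seed, so the comparison
# is reproducible and not hand-picked to favor either model.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

models = {
    "boosting (XGBoost stand-in)": GradientBoostingClassifier(random_state=0),
    "single tree (HDT stand-in)": DecisionTreeClassifier(random_state=0),
}

for name, model in models.items():
    # Library-default hyperparameters: neither model is tuned.
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```

On real industry data one would replace `make_classification` with the actual dataset, but the principle stays the same: defaults plus cross-validation removes the "specially tuned case" objection in both directions.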


Sandy4321 commented 1 year ago

Could you please share any updates in this direction? For example, I see:

https://towardsdatascience.com/tuning-xgboost-with-xgboost-writing-your-own-hyper-parameters-optimization-engine-a593498b5fba

https://towardsdatascience.com/xgboost-how-deep-learning-can-replace-gradient-boosting-and-decision-trees-part-2-training-b432620750f8

https://towardsdatascience.com/xgboost-how-deep-learning-can-replace-gradient-boosting-and-decision-trees-291dc9365656 (XGBoost: How Deep Learning Can Replace Gradient Boosting and Decision Trees, Part 1)