JuliaAI / MLJ.jl

A Julia machine learning framework
https://juliaai.github.io/MLJ.jl/

Please add CatBoost or any alternate package (pure Julia) which can beat it #992

Closed MrBenzWorld closed 1 year ago

MrBenzWorld commented 1 year ago

I'm committed to learning Julia. I have tried MLJFlux, BetaML, MLJ models, EvoTrees, etc.

But nothing gives results comparable to CatBoost in performance and quality (MLJFlux comes close but is computationally expensive).

I'm testing regression models and need high-quality results for a research publication.

I appreciate the entire Julia team and MLJ. I like it. I hope you will consider my request. Thank you.

ablaom commented 1 year ago

There is a discussion about this here: https://github.com/beacon-biosignals/CatBoost.jl/issues/9 .

It seems a pity you are not able to get what you want from EvoTrees.jl, which should be similar to CatBoost but is pure Julia. The main developer of EvoTrees.jl is quite active and, it seems to me, open to feature requests.

In my view, it's a better use of limited resources to improve pure Julia implementations than to wrap Python/C implementations. And in the case of gradient tree boosters, we already have Julia and MLJ interfaces for XGBoost and LightGBM. Do we really need a fourth tree booster?

Of course if someone is interested in an MLJ interface for CatBoost.jl, then I am happy to provide guidance.
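For context, a minimal sketch (untested) of loading the boosters mentioned above through MLJ's model registry. It assumes the corresponding interface packages (e.g. MLJXGBoostInterface, LightGBM.jl, EvoTrees.jl) are in the active environment, and the registered model names are worth confirming with `models()`:

```julia
using MLJ

# Model names as registered with MLJ at the time of writing; confirm with `models()`.
# The corresponding interface packages must be installed in the active environment.
XGBoostRegressor = @load XGBoostRegressor pkg=XGBoost  verbosity=0
LGBMRegressor    = @load LGBMRegressor    pkg=LightGBM verbosity=0
EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees verbosity=0
```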

MrBenzWorld commented 1 year ago

I also thought EvoTrees was a good option in Julia.

I request that EvoTrees with TreeParzen optimization be added to MLJ.

Also, please show one example where it improves the quality of results. Sorry for the discussion, but these are Julia-learner problems. Once I get proper results I will continue with it; otherwise I'm forced back to Python.

You can use the Boston or Ames datasets from MLJ for the implementation. @ablaom
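For what it's worth, here is a minimal sketch (untested) of the requested workflow: tuning an EvoTreeRegressor with MLJ on the Boston data. The hyperparameter ranges are purely illustrative, and EvoTrees.jl is assumed to be installed. TreeParzen.jl's MLJTreeParzenTuning would be an alternative tuning strategy, but it uses its own space specification, so MLJ's built-in LatinHypercube strategy is shown instead:

```julia
using MLJ

# Assumes EvoTrees.jl is in the active environment.
EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees verbosity=0

X, y = @load_boston   # small regression dataset that ships with MLJ

model = EvoTreeRegressor()

# Illustrative search space; not a recommendation.
r_eta   = range(model, :eta, lower=0.01, upper=0.3, scale=:log)
r_depth = range(model, :max_depth, lower=3, upper=10)

tuned = TunedModel(
    model      = model,
    tuning     = LatinHypercube(),   # TreeParzen.jl's MLJTreeParzenTuning() is an
                                     # alternative, with its own space specification
    resampling = CV(nfolds=5, shuffle=true, rng=123),
    range      = [r_eta, r_depth],
    measure    = rms,
    n          = 50,
)

mach = machine(tuned, X, y)
fit!(mach)
fitted_params(mach).best_model   # best hyperparameters found
```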

ablaom commented 1 year ago

I think these various tools are individually well-documented. If you have a specific tutorial you'd like to see, please make a request at https://github.com/JuliaAI/DataScienceTutorials.jl/issues

MrBenzWorld commented 1 year ago

Yes, they are well documented individually. That is really helpful for beginners, and it helped me a lot.

But there is a problem getting the best results.

We chose Julia for speed and for the best results.

MLJ works best with all of these tools (especially the pure Julia packages),

but there is no example of MLJ + EvoTrees + TreeParzen (or Latin hypercube, etc.). Such an example should demonstrate the best performance with accurate results (both time and quality).

It should beat Python-based XGBoost and LightGBM tuned with Optuna.

You have shown an example of how to use it, but not how to get the best out of it.

As a Julia learner, this is just a suggestion. Julia doesn't have access to Kaggle, but you have the Julia cloud or Colab; you could also share competition notebooks somewhere.

Thanks for your efforts and cooperation @ablaom
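As an illustration of the kind of timed accuracy check being requested, here is a minimal sketch (untested), assuming the Boston data and MLJ's built-in rms/mae measures:

```julia
using MLJ

EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees verbosity=0
X, y = @load_boston

# Cross-validated error estimates, with wall-clock time reported by @time.
@time e = evaluate(EvoTreeRegressor(), X, y;
                   resampling=CV(nfolds=5, rng=1),
                   measure=[rms, mae],
                   verbosity=0)
e.measurement   # aggregated rms and mae across folds
```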

MrBenzWorld commented 1 year ago

Yes, I will make the request as you mentioned. Thank you.

ablaom commented 1 year ago

https://github.com/beacon-biosignals/CatBoost.jl/pull/16

MrBenzWorld commented 1 year ago

Thank you very much, @ablaom.

ablaom commented 1 year ago

You're welcome. I'm just providing guidance. The main work is being carried out by @tylerjthomas9.

ablaom commented 1 year ago

Closed as completed: https://github.com/JuliaAI/CatBoost.jl#mlj-example
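For reference, a minimal sketch (untested) along the lines of the linked README example. It assumes CatBoost.jl is installed and that the model is registered with MLJ under the name CatBoostRegressor:

```julia
using MLJ

# Assumes CatBoost.jl is in the active environment.
CatBoostRegressor = @load CatBoostRegressor pkg=CatBoost verbosity=0

X, y = @load_boston
mach = machine(CatBoostRegressor(), X, y)
fit!(mach)
yhat = predict(mach, X)   # point predictions on the training table
```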