dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.readthedocs.io/en/stable/
Apache License 2.0

Support dataframe protocol. #10452

Open trivialfis opened 5 months ago

trivialfis commented 5 months ago

https://data-apis.org/dataframe-protocol/latest/index.html

MarcoGorelli commented 1 day ago

Hi @trivialfis

Quick note to say that I'd discourage using the interchange protocol - I've collected some reasons why here: https://github.com/pandas-dev/pandas/issues/56732#issuecomment-2466301769

If I may, I'd like to suggest Narwhals and/or the Arrow PyCapsule Interface. This is what several packages (e.g. Altair, Plotly, Vegafusion, Marimo, scikit-lego, Rio, and more) are using, with several others (Bokeh, Prophet, formulaic) considering doing the same.

Happy to give this a go if you'd be open to it
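
For illustration only, here is a minimal sketch of how a consumer library could tell the two approaches apart. It relies on the fact that the Arrow PyCapsule Interface exposes tabular data through an `__arrow_c_stream__` method, while the interchange protocol uses `__dataframe__`. The helper `detect_interface` and the two stand-in classes are hypothetical names made up for this example; none of this is XGBoost's actual code, and a real implementation would go on to consume the returned capsule or interchange object.

```python
def detect_interface(obj):
    """Hypothetical helper: report which data-exchange hook `obj` exposes."""
    # Arrow PyCapsule Interface: tabular producers define __arrow_c_stream__.
    if hasattr(obj, "__arrow_c_stream__"):
        return "arrow_pycapsule"
    # Dataframe interchange protocol: producers define __dataframe__.
    if hasattr(obj, "__dataframe__"):
        return "dataframe_interchange"
    return "unsupported"


class FakeArrowTable:
    """Stand-in for an object implementing the Arrow PyCapsule Interface."""

    def __arrow_c_stream__(self, requested_schema=None):
        raise NotImplementedError("stand-in only, no real Arrow data")


class FakeInterchangeFrame:
    """Stand-in for an object implementing the interchange protocol."""

    def __dataframe__(self, nan_as_null=False, allow_copy=True):
        raise NotImplementedError("stand-in only, no real dataframe")


for candidate in (FakeArrowTable(), FakeInterchangeFrame(), [1, 2, 3]):
    print(type(candidate).__name__, "->", detect_interface(candidate))
```

The sketch checks the Arrow hook first, so an object that offers both would be routed through the PyCapsule path, in line with the preference expressed above.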

trivialfis commented 1 day ago

Thank you for sharing, will look into these.