rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

[FEA] Enable Polars GPU execution via global configuration (default options) #16723

Open beckernick opened 2 months ago

beckernick commented 2 months ago

Polars provides a Config object that can be used globally or as a context manager.

Recent discussions have clarified that some users (and downstream libraries) would benefit from being able to globally configure Polars to use the GPU engine rather than configuring it on a per-collect call basis.

The Polars Config system is based on environment variables, so supporting a full set of configuration options via strings would be somewhat complex. But, as @wence- has noted, it would likely be more straightforward to enable the default configuration via an environment variable.

We should implement a Config option for the default GPU engine configuration, as it would be useful to many end-users and library developers building on top of Polars.
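One possible shape for such an option, sketched under assumptions: the environment variable name POLARS_DEFAULT_ENGINE and both helper functions are hypothetical and are not part of Polars. The only real API relied on is the engine keyword of LazyFrame.collect, which the helper forwards when the GPU is requested.

```python
import os

# Hypothetical variable name, chosen for illustration only.
ENGINE_ENV_VAR = "POLARS_DEFAULT_ENGINE"


def default_engine(environ=os.environ):
    """Return "gpu" if the variable requests it, or None if it is unset."""
    value = environ.get(ENGINE_ENV_VAR, "").strip().lower()
    if value == "":
        return None
    if value != "gpu":
        raise ValueError(f"{ENGINE_ENV_VAR} must be 'gpu' or unset, got {value!r}")
    return value


def collect_with_default(lazy_frame, engine=None, environ=os.environ):
    """Collect a lazy frame; an explicit engine overrides the environment default."""
    chosen = engine if engine is not None else default_engine(environ)
    if chosen is None:
        # No preference configured: keep Polars' own default behavior.
        return lazy_frame.collect()
    return lazy_frame.collect(engine=chosen)
```

With this pattern, a per-call engine argument still takes precedence, so existing code that selects an engine explicitly would behave as before.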

vyasr commented 1 month ago

Note: Ibis recently added support for forwarding the engine down to Polars, which in theory means that we no longer need a global Polars config for that case to work (there are of course other valid use cases).

bdice commented 4 weeks ago

It would be good for this global configuration to support changing the RMM memory resource. Coarse control would be sufficient, similar to what we have for libcudf benchmarks via the CLI flag --rmm_mode and for cudf.pandas via the environment variable CUDF_PANDAS_RMM_MODE.
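A minimal sketch of what such coarse control could look like. The variable name POLARS_GPU_RMM_MODE, the set of accepted mode strings, and the "pool" default are all assumptions made for illustration, loosely modeled on the existing options mentioned above. The second function needs RMM and a CUDA-capable GPU, so it imports rmm lazily.

```python
import os

# Hypothetical variable name and mode list, for illustration only.
RMM_MODE_ENV_VAR = "POLARS_GPU_RMM_MODE"
VALID_MODES = ("cuda", "pool", "async", "managed", "managed_pool")


def rmm_mode(environ=os.environ, default="pool"):
    """Read and validate the requested mode; the default here is arbitrary."""
    mode = environ.get(RMM_MODE_ENV_VAR, default).strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(f"{RMM_MODE_ENV_VAR} must be one of {VALID_MODES}, got {mode!r}")
    return mode


def make_memory_resource(mode):
    """Map a mode string to an RMM memory resource (requires rmm and a GPU)."""
    import rmm  # imported lazily so that rmm_mode() works without a GPU

    if mode == "cuda":
        return rmm.mr.CudaMemoryResource()
    if mode == "async":
        return rmm.mr.CudaAsyncMemoryResource()
    if mode == "managed":
        return rmm.mr.ManagedMemoryResource()
    if mode == "pool":
        return rmm.mr.PoolMemoryResource(rmm.mr.CudaMemoryResource())
    if mode == "managed_pool":
        return rmm.mr.PoolMemoryResource(rmm.mr.ManagedMemoryResource())
    raise ValueError(f"unknown RMM mode: {mode!r}")
```

The resulting resource could then be installed with rmm.mr.set_current_device_resource before any GPU query runs.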