lux-org / lux

Automatically visualize your pandas dataframe via a single print! 📊 💡
Apache License 2.0
5.08k stars, 363 forks

[Feature Request] RAPIDS cuDF GPU option in addition to pandas #478

Open exactlyallan opened 2 years ago

exactlyallan commented 2 years ago

@dorisjlee would you be open to adding cuDF-backed functionality as an option? I believe there are several places where GPU acceleration would help over pandas on larger datasets.

We currently have several other visualization projects that integrate with cuDF, such as datashader, and our own cuxfilter.

dorisjlee commented 2 years ago

Hey @exactlyallan, thanks for the suggestion! We're open to contributions and improvements involving a cuDF backend. This would likely involve adding a new plotting backend for Lux that uses cuDF. The plotting backends we currently support are Altair and matplotlib. I think this would help quite a bit with rendering large scatterplots (which are currently plotted as binned heatmaps). You can find some of the past performance benchmarks that we've done in this VLDB paper: https://arxiv.org/pdf/2105.00121.pdf
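
To make the "binned heatmap" idea concrete, here is a minimal sketch of that kind of aggregation using plain NumPy. It is not Lux's actual implementation: the helper name `bin_scatter`, the 40x40 grid, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def bin_scatter(x, y, bins=40):
    """Aggregate (x, y) points into a bins x bins grid of counts.

    Hypothetical helper for illustration; not part of Lux.
    """
    # histogram2d returns the count grid plus the bin edges along each axis
    counts, x_edges, y_edges = np.histogram2d(x, y, bins=bins)
    return counts, x_edges, y_edges

# Synthetic stand-in for a large scatterplot (assumed data)
rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = rng.normal(size=100_000)

counts, x_edges, y_edges = bin_scatter(x, y, bins=40)
print(counts.shape)       # grid dimensions
print(int(counts.sum()))  # every point falls into exactly one cell
```

With this approach the renderer draws at most 1,600 cells rather than 100,000 individual points, so the aggregation step is where most of the work happens on large dataframes.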

exactlyallan commented 2 years ago

@dorisjlee We are starting to work on this. Could you suggest any profiling tools or techniques you use that work with Lux?