[Closed] Ginger-Tec closed this issue 5 months ago
Your analysis is correct. Wide dataframes (many columns, few rows) are not what cuDF is optimized for. That is a fundamental property of the Arrow format, and that property is accentuated on GPUs because of the performance characteristics of memory accesses on GPUs relative to CPUs. #14548 has a lot of good discussion on this topic, so I'd have a look there and see if that discussion matches your expectations. Feel free to follow up here if you have more questions.
Thank you, @vyasr , for your quick and clear response. Thanks to you, I am confident in introducing cuDF in my presentation today!
I benchmarked element-wise addition (+) of two DataFrames of shape (4303, 3766) across several libraries: pandas, NumPy, Polars, and cuDF. Here are the results:
Local Environment (i7-12 Windows)
Colab Environment
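For reference, the benchmark can be reproduced with a sketch like the one below (the timing harness and data values are my own assumptions, not the exact code used for the numbers above; the cuDF lines are commented out because they require a GPU, but cuDF mirrors the pandas API):

```python
import time

import numpy as np
import pandas as pd

rows, cols = 4303, 3766
a = np.random.rand(rows, cols)
b = np.random.rand(rows, cols)


def bench(fn, repeats=3):
    """Return the best wall-clock time over a few repeats."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - t0)
    return best


print(f"numpy : {bench(lambda: a + b):.4f}s")

df_a, df_b = pd.DataFrame(a), pd.DataFrame(b)
print(f"pandas: {bench(lambda: df_a + df_b):.4f}s")

# cuDF uses the same DataFrame API; uncomment on a machine with a GPU:
# import cudf
# ga, gb = cudf.DataFrame(a), cudf.DataFrame(b)
# print(f"cudf  : {bench(lambda: ga + gb):.4f}s")
```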
Analysis of cuDF Performance
Although cuDF (GPU-accelerated) was expected to outperform the other libraries, it was the slowest. Potential reasons for this outcome:
Apache Arrow Columnar Memory Format: cuDF stores data column by column, so a wide, short DataFrame of shape (4303, 3766) becomes thousands of small arrays, and per-column overhead dominates the actual arithmetic.
Small Row Count: with only 4,303 rows per column, each column carries too little parallel work to saturate the GPU.
Potential Bottlenecks: host-to-device data transfer and per-operation initialization on the GPU can cost more than the computation itself at this size.
Conclusion
Given these factors, it is reasonable to conclude that cuDF may not perform optimally for datasets with a small number of rows and a large number of columns in simple arithmetic operations. The overhead associated with data transfer and initialization on the GPU, combined with the columnar processing model, can outweigh the benefits of GPU acceleration in this specific context.
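To make the per-column overhead argument concrete, here is a CPU-side analogy (my own illustration, not cuDF's actual implementation): one vectorized pass over the whole array versus one call per column, which stands in for launching one GPU kernel per column. With many small columns, the fixed per-call cost dominates.

```python
import time

import numpy as np

rows, cols = 4303, 3766
a = np.random.rand(rows, cols)
b = np.random.rand(rows, cols)

# One fused pass over all elements.
t0 = time.perf_counter()
fused = a + b
t_fused = time.perf_counter() - t0

# One call per column, analogous to one kernel launch per column.
per_col = np.empty_like(a)
t0 = time.perf_counter()
for j in range(cols):
    per_col[:, j] = a[:, j] + b[:, j]
t_cols = time.perf_counter() - t0

assert np.allclose(fused, per_col)
print(f"fused pass : {t_fused:.4f}s")
print(f"per-column : {t_cols:.4f}s")
```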
Is it reasonable to summarize the reasons cuDF is slow in this scenario as above? If anything is wrong or missing, I would really appreciate it if you could let me know.