Open magnus-ekman opened 3 months ago
Hi @magnus-ekman,
Thank you for the report. This is an issue with cudf when we access scalar values from a column: such accesses are inherently slower than in pandas. Here is an example:
# Pandas
In [1]: import pandas as pd
In [2]: s = pd.Series([10, 1, 2, 3, 4, 5])
In [3]: %timeit s[2]
4.73 μs ± 12.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
# cudf
In [1]: import cudf
In [2]: s = cudf.Series([10, 1, 2, 3, 4, 5])
In [3]: %timeit s[2]
1.66 ms ± 1.71 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
This slowdown is amplified in your example, which performs many such accesses. We at NVIDIA are actively working to alleviate it.
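To make the amplification concrete, here is a rough back-of-the-envelope sketch using the two `%timeit` results above. The element count `n_accesses` is a hypothetical value chosen only for illustration, and real loops will include other overheads, so treat the totals as an estimate rather than a measurement.

```python
# Per-access timings copied from the %timeit output above (in seconds).
pandas_per_access = 4.73e-6   # 4.73 microseconds
cudf_per_access = 1.66e-3     # 1.66 milliseconds

# Ratio of a single scalar access in cudf versus pandas.
ratio = cudf_per_access / pandas_per_access

# Hypothetical number of scalar accesses in an element-wise loop.
n_accesses = 10_000

# Estimated time spent on scalar access alone over the whole loop.
pandas_total = n_accesses * pandas_per_access
cudf_total = n_accesses * cudf_per_access

print(f"per-access slowdown: about {ratio:.0f}x")
print(f"pandas: {pandas_total:.3f} s, cudf: {cudf_total:.1f} s")
```

The per-access gap is a few hundred times, which is the same order of magnitude as the slowdown reported for the loop in this issue.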
However, as a temporary workaround, you can disable GPU acceleration for a block of code like this:
from cudf.pandas.module_accelerator import disable_module_accelerator
with disable_module_accelerator():
    # your pandas code
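Another possible approach, sketched here as an assumption rather than an officially recommended fix, is to convert the data to a NumPy array once and loop over that, so the loop does one bulk conversion instead of one scalar access per element. The function `process` and the sample DataFrame are hypothetical stand-ins for your own code. The sketch uses plain pandas; whether it helps under cudf.pandas depends on `to_numpy()` performing a single device-to-host copy, which I have not measured here.

```python
import pandas as pd


def process(value):
    # Hypothetical per-element function standing in for the real one.
    return value * 2


df = pd.DataFrame({"a": [10, 1, 2], "b": [3, 4, 5]})

# One bulk conversion to a NumPy array, instead of indexing the
# DataFrame once per element inside the loop.
values = df.to_numpy()

# Element-wise loop over plain host memory.
results = [[process(v) for v in row] for row in values]

# Rebuild a DataFrame with the original labels if one is needed.
out = pd.DataFrame(results, index=df.index, columns=df.columns)
print(out)
```

This keeps the element-wise style of the original loop, at the cost of holding a host copy of the data.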
Thanks. I have a (perhaps silly) question on the workaround that is related to this slowdown. When I work in a Jupyter notebook, I like to simply type "df" in a cell and execute the cell to get the DataFrame printed in a nicely formatted way. Doing so is super slow with cudf. If I try to apply your suggested workaround, I don't get a print-out. It works if I instead do "print(df)", but it will not be as nicely formatted. Any ideas of how to solve this?
@magnus-ekman I think the issue with displaying df might be the same as #15747.
@galipremsagar Maybe we can work on accelerating the fancy repr in the nearer term, since it should be easier to solve than the broader problem of scalar access.
Describe the bug I have a case where I loop through each element in a dataframe and call a function for each element. When running with cudf.pandas, this takes on the order of 100x longer than when running with just pandas. I recognize that best practice is to write vectorized functions, but there are cases where it is just easier to loop through each element. I don't expect a speedup compared to the non-cudf implementation, but it would be good if there weren't a huge slowdown.
Steps/Code to reproduce bug Code run in a Jupyter notebook:
Expected behavior When running without cudf this takes 60ms. When running with cudf it takes 10 seconds. I would expect performance with cudf to be comparable to performance without cudf.
Environment overview Bare-metal, pip install
Environment details Not sure where to find that script. Here is my basic setup: Platform: x86 + A100 GPU; OS: Ubuntu 22.04.4 LTS; cuDF: cudf-cu12, version 24.6.1; CUDA: Cuda compilation tools, release 12.3, V12.3.107; Python: 3.10.12; running in a Jupyter notebook.