Closed roshkins closed 2 years ago
The undefined behavior in the kernel due to the reverse guard appears to be overwriting the input data buffer when it occurs. We'll take a look into what's happening here, but "protecting" against undefined behavior in user code is challenging.
Hopefully the examples and documentation below resolve your issue.
import cudf
import numpy as np

# test_fn is the kernel from the original report (not shown here)
df = cudf.DataFrame([[1, 4], [2, 5], [3, 6]], columns=['col1', 'col2'])
df = df.repeat(10)
df.reset_index(drop=True, inplace=True)
out_df = cudf.DataFrame()
print(df.head())
for i in range(5):
    res = df.apply_chunks(test_fn,
                          incols={"col2": 'in_col'},
                          outcols={'out_col': np.float64},
                          chunks=len(df)  # a single chunk spanning every row
                          )
    print(res.head())
col1 col2
0 1 4
1 1 4
2 1 4
3 1 4
4 1 4
col1 col2 out_col
0 1 4 4.0
1 1 4 4.0
2 1 4 4.0
3 1 4 4.0
4 1 4 4.0
col1 col2 out_col
0 4607182418800017408 4629137466983448576 4.607182e+18
1 0 4607182418800017408 4.607182e+18
2 0 4611686018427387904 4.613938e+18
3 0 4613937818241073152 4.613938e+18
4 0 4616189618054758400 0.000000e+00
col1 col2 out_col
0 4607182418800017408 4629137466983448576 4.607182e+18
1 0 4607182418800017408 4.607182e+18
2 0 4611686018427387904 4.613938e+18
3 0 4613937818241073152 4.613938e+18
4 0 4616189618054758400 0.000000e+00
col1 col2 out_col
0 4607182418800017408 4629137466983448576 4.607182e+18
1 0 4607182418800017408 4.607182e+18
2 0 4611686018427387904 4.613938e+18
3 0 4613937818241073152 4.613938e+18
4 0 4616189618054758400 0.000000e+00
col1 col2 out_col
0 4607182418800017408 4629137466983448576 4.607182e+18
1 0 4607182418800017408 4.607182e+18
2 0 4611686018427387904 4.613938e+18
3 0 4613937818241073152 4.613938e+18
4 0 4616189618054758400 0.000000e+00
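The large integers in the corrupted output look like float64 bit patterns read back as int64, which would be consistent with float writes landing in the integer input buffer. The following is a small diagnostic sketch using only the standard library; the helper name `int64_bits_as_float64` is made up for illustration, and the check suggests a cause rather than confirming one.

```python
import struct


def int64_bits_as_float64(value):
    """Reinterpret the 8 bytes of a signed 64-bit integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<q", value))[0]


# Values copied from the corrupted col1/col2 columns above.
corrupted = [
    4607182418800017408,
    4611686018427387904,
    4613937818241073152,
    4616189618054758400,
    4629137466983448576,
]

decoded = [int64_bits_as_float64(v) for v in corrupted]
print(decoded)
```

The decoded values are small round floats (1.0, 2.0, 3.0, 4.0 and 30.0), which is what one would expect if float64 data had been written over the int64 columns.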
With a thread boundary guard to ensure we don't access memory beyond the buffer, I don't see any undefined behavior.
from numba import cuda

def test_fn2(in_col, out_col):
    # Thread id in a 1D block
    tx = cuda.threadIdx.x
    # Block id in a 1D grid
    ty = cuda.blockIdx.x
    # Block width, i.e. number of threads per block
    bw = cuda.blockDim.x
    # Compute flattened index inside the array
    row_idx = tx + ty * bw
    if row_idx < in_col.size:  # thread boundary guard
        if row_idx % 2 == 0:
            # Reads the next row; in bounds here because the column length (30) is even
            out_col[row_idx] = in_col[row_idx + 1]
        else:
            out_col[row_idx] = in_col[row_idx]
df = cudf.DataFrame([[1, 4], [2, 5], [3, 6]], columns=['col1', 'col2'])
df = df.repeat(10)
df.reset_index(drop=True, inplace=True)
out_df = cudf.DataFrame()
print(df.head())
for i in range(5):
    res = df.apply_chunks(test_fn2,
                          incols={"col2": 'in_col'},
                          outcols={'out_col': np.float64},
                          chunks=len(df)  # a single chunk spanning every row
                          )
    print(res.head())
col1 col2
0 1 4
1 1 4
2 1 4
3 1 4
4 1 4
col1 col2 out_col
0 1 4 4.0
1 1 4 4.0
2 1 4 4.0
3 1 4 4.0
4 1 4 4.0
col1 col2 out_col
0 1 4 4.0
1 1 4 4.0
2 1 4 4.0
3 1 4 4.0
4 1 4 4.0
col1 col2 out_col
0 1 4 4.0
1 1 4 4.0
2 1 4 4.0
3 1 4 4.0
4 1 4 4.0
col1 col2 out_col
0 1 4 4.0
1 1 4 4.0
2 1 4 4.0
3 1 4 4.0
4 1 4 4.0
col1 col2 out_col
0 1 4 4.0
1 1 4 4.0
2 1 4 4.0
3 1 4 4.0
4 1 4 4.0
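To see why the guard matters, the per-thread logic can be emulated on the CPU. This is only a sketch: `emulate_kernel` and `threads_per_block` are hypothetical names, the threads run sequentially rather than in parallel, and the extra `row_idx + 1 < size` check is an addition that the kernel above does not have (it matters when the column length is odd).

```python
def emulate_kernel(in_col, threads_per_block=32):
    """Run a test_fn2-style thread body sequentially for every launched thread."""
    size = len(in_col)
    out_col = [0.0] * size
    # Whole blocks are launched, so there are usually more threads than rows.
    num_blocks = (size + threads_per_block - 1) // threads_per_block
    skipped = 0
    for block_idx in range(num_blocks):
        for thread_idx in range(threads_per_block):
            row_idx = thread_idx + block_idx * threads_per_block
            if row_idx >= size:  # thread boundary guard
                skipped += 1
                continue
            if row_idx % 2 == 0 and row_idx + 1 < size:
                out_col[row_idx] = float(in_col[row_idx + 1])
            else:
                out_col[row_idx] = float(in_col[row_idx])
    return out_col, skipped


# Same shape as col2 in the thread: 4, 5 and 6 repeated ten times each.
in_col = [4] * 10 + [5] * 10 + [6] * 10
out_col, skipped = emulate_kernel(in_col)
print(out_col[:5], skipped)
```

With 30 rows and 32 threads per block, two threads fall past the end of the buffer. In this emulation they are skipped; on the GPU, without the guard, those threads would read and write memory outside the column.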
Note that you could express this test function as a rolling window UDF as well:
def func(window):
    current = window[0]
    if current % 2 == 0:
        return window[1]  # assume at least window size 2
    else:
        return current

df.col2.rolling(2).apply(func)
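For intuition about what the window function receives, here is a plain-Python emulation. It assumes pandas-style trailing windows, where each window ends at the current row and rows without a full window produce a missing value; `rolling_apply` is a hypothetical helper, and cuDF's own null handling may differ. Note also that `func` tests the parity of the value in `window[0]`, not of the row index, so it is not an exact equivalent of `test_fn2`.

```python
def rolling_apply(values, window, func):
    """Apply func to each trailing window; incomplete windows yield None."""
    out = []
    for end in range(len(values)):
        start = end - window + 1
        if start < 0:
            out.append(None)  # not enough rows yet for a full window
        else:
            out.append(func(values[start:end + 1]))
    return out


def func(window):
    current = window[0]
    if current % 2 == 0:
        return window[1]  # assume at least window size 2
    else:
        return current


result = rolling_apply([4, 4, 5, 5, 6], 2, func)
print(result)
```

Because every window handed to `func` is a complete slice, there is no index arithmetic in user code that could run off the end of the buffer.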
We're also actively enhancing the user experience for UDFs. We've added df.apply and "row" abstractions, similar to pandas. You may be interested in this section of our nightly documentation: https://docs.rapids.ai/api/cudf/nightly/user_guide/guide-to-udfs.html#Generalized-NA-Support
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.
I'm going to close this because users writing apply functions are effectively writing kernels, and we just don't have mechanisms for making all such cases safe.
Describe the bug: When I reference an adjacent cell in apply_chunks, the result is non-deterministic when executed repeatedly.
Expected behavior: The output should always be a column containing the computed data, rather than sporadically a column of zeros.