rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

[BUG] Hang in libcudf replace_multiple #16152

Open lithomas1 opened 3 days ago

lithomas1 commented 3 days ago

Describe the bug

libcudf hangs in cudf::strings::replace_multiple when the targets column contains an empty string.

Steps/Code to reproduce bug

Using pylibcudf:

import pyarrow as pa
import cudf._lib.pylibcudf as plc

input_array = pa.array([
    "a",
    "b",
    "c",
    "d",
    "e",
    "f",
    "g",
    "h",
])

data = pa.array([
    "AbC",
    "de",
    "FGHI",
    "j",
    "kLm",
    "nOPq",
    "",
    "RsT",
])

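print('hang starts now!')

# Arguments are (input, targets, repls). `data` is passed as the targets
# column and contains an empty string, which triggers the hang.
plc.strings.replace.replace_multiple(
    plc.interop.from_arrow(input_array),
    plc.interop.from_arrow(data),
    plc.interop.from_arrow(input_array),
)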
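Because the reproducer never returns, it can help to run it in a child process with a timeout so the hang does not block the interactive session. The following is a minimal sketch using only the Python standard library; the helper name `run_with_timeout` is hypothetical and not part of cudf or pylibcudf, and whether a process stuck on a GPU kernel shuts down cleanly when killed is an assumption that may depend on the driver.

```python
import subprocess
import sys


def run_with_timeout(script: str, timeout_s: float) -> str:
    """Run `script` in a fresh Python interpreter.

    Returns "ok" on a zero exit code, "error" on a non-zero exit code,
    and "hang" if the child does not finish within `timeout_s` seconds.
    (Hypothetical helper for illustration; not a cudf API.)
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", script],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before re-raising.
        return "hang"
    return "ok" if proc.returncode == 0 else "error"
```

To use it, pass the reproducer source above as the `script` string with a generous timeout (for example 60 seconds); a result of "hang" then indicates the call did not return in time.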

Additional context

Backtrace (sorry, I didn't have the time to figure out how to compile libcudf in debug mode, but hopefully this helps)

I captured it by pressing Ctrl+C in gdb after "hang starts now!" was printed.

```
#8 0x00007fff3ec70898 in cudaStreamSynchronize () from /home/coder/.conda/envs/rapids/lib/libcudart.so.12
#9 0x00007fff189a2a78 in cudf::detail::sizes_to_offsets_iterator thrust::cuda_cub::detail::exclusive_scan_n_impl, thrust::cuda_cub::execute_on_stream_base>, thrust::transform_iterator >, long> (*)(int*, int*, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>), &(std::pair >, long> cudf::strings::detail::make_offsets_child_column(int*, int*, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>)), 1u>, int*, int const> >, thrust::counting_iterator, thrust::use_default, thrust::use_default>, long, cudf::detail::sizes_to_offsets_iterator, long, thrust::plus >(thrust::cuda_cub::execution_policy, thrust::cuda_cub::execute_on_stream_base> >&, thrust::transform_iterator >, long> (*)(int*, int*, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>), &(std::pair >, long> cudf::strings::detail::make_offsets_child_column(int*, int*, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>)), 1u>, int*, int const> >, thrust::counting_iterator, thrust::use_default, thrust::use_default>, long, cudf::detail::sizes_to_offsets_iterator, long, thrust::plus) [clone .isra.0] () from /home/coder/cudf/cpp/build/conda/cuda-12.2/release/libcudf.so
#10 0x00007fff189a70d4 in auto cudf::detail::sizes_to_offsets >, long> (*)(int*, int*, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>), &(std::pair >, long> cudf::strings::detail::make_offsets_child_column(int*, int*, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>)), 1u>, int*, int const> >, thrust::counting_iterator, thrust::use_default, thrust::use_default>, int*>(thrust::transform_iterator >, long> (*)(int*, int*, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>), &(std::pair >, long> cudf::strings::detail::make_offsets_child_column(int*, int*, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>)), 1u>, int*, int const> >, thrust::counting_iterator, thrust::use_default, thrust::use_default>, thrust::transform_iterator >, long> (*)(int*, int*, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>), &(std::pair >, long> cudf::strings::detail::make_offsets_child_column(int*, int*, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>)), 1u>, int*, int const> >, thrust::counting_iterator, thrust::use_default, thrust::use_default>, int*, rmm::cuda_stream_view) () from /home/coder/cudf/cpp/build/conda/cuda-12.2/release/libcudf.so
#11 0x00007fff189abb9e in std::pair >, long> cudf::strings::detail::make_offsets_child_column(int*, int*, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>) () from /home/coder/cudf/cpp/build/conda/cuda-12.2/release/libcudf.so
#12 0x00007fff1a14d243 in auto cudf::strings::detail::make_strings_children(cudf::strings::detail::(anonymous namespace)::replace_multi_fn, int, int, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>) () from /home/coder/cudf/cpp/build/conda/cuda-12.2/release/libcudf.so
#13 0x00007fff1a1418ef in cudf::strings::detail::(anonymous namespace)::replace_string_parallel(cudf::strings_column_view const&, cudf::strings_column_view const&, cudf::strings_column_view const&, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>) () from /home/coder/cudf/cpp/build/conda/cuda-12.2/release/libcudf.so
#14 0x00007fff1a14536f in cudf::strings::detail::replace_multiple(cudf::strings_column_view const&, cudf::strings_column_view const&, cudf::strings_column_view const&, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>) () from /home/coder/cudf/cpp/build/conda/cuda-12.2/release/libcudf.so
#15 0x00007fff1a14547a in cudf::strings::replace_multiple(cudf::strings_column_view const&, cudf::strings_column_view const&, cudf::strings_column_view const&, rmm::cuda_stream_view, cuda::mr::__4::basic_resource_ref<(cuda::mr::__4::_AllocType)1, cuda::mr::__4::device_accessible>) () from /home/coder/cudf/cpp/build/conda/cuda-12.2/release/libcudf.so
```

davidwendt commented 2 days ago

Empty targets for cudf::strings::replace_multiple are not supported. I'll open a PR to fix the hang and update the doxygen.
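Until a fix lands, callers may want to reject empty targets up front. The following is a pure-Python sketch of that guard wrapped around a reference implementation; it is not libcudf's code, and the semantics it assumes (scan each string left to right, at each position the first matching target wins, a single replacement is shared by all targets) are an assumption for illustration. The name `replace_multiple_reference` is hypothetical. The sketch also shows why a zero-length match is awkward for a scanner of this shape: it would not advance the scan position.

```python
from typing import List, Optional, Sequence


def replace_multiple_reference(
    strings: Sequence[Optional[str]],
    targets: Sequence[str],
    repls: Sequence[str],
) -> List[Optional[str]]:
    """Pure-Python reference for a multi-target replace (illustrative only)."""
    # Guard: an empty target matches at every position without consuming
    # any characters, so it is rejected before scanning.
    if any(t == "" for t in targets):
        raise ValueError("empty target strings are not supported")
    if len(repls) not in (1, len(targets)):
        raise ValueError("repls must have one entry or one entry per target")

    out: List[Optional[str]] = []
    for s in strings:
        if s is None:  # nulls pass through unchanged
            out.append(None)
            continue
        pieces: List[str] = []
        pos = 0
        while pos < len(s):
            for i, t in enumerate(targets):
                if s.startswith(t, pos):
                    pieces.append(repls[i] if len(repls) > 1 else repls[0])
                    pos += len(t)  # always > 0 because of the guard above
                    break
            else:
                # No target matched here; copy one character and move on.
                pieces.append(s[pos])
                pos += 1
        out.append("".join(pieces))
    return out
```

With the inputs from the reproducer, this sketch raises `ValueError` immediately instead of scanning, because the targets list contains `""`.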