rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0
8.23k stars 884 forks source link

[BUG] memcheck and racecheck errors in avro reader with `codec="deflate"` #15216

Open wence- opened 6 months ago

wence- commented 6 months ago

Describe the bug

import cudf
import fastavro
import io

total_rows = num_rows = rows_per_block = 2048
total_bytes_per_block = rows_per_block * 7

records = [{"0": f"{i:0>6}"} for i in range(total_rows)]
schema = {
    "name": "root",
    "type": "record",
    "fields": [
        {"name": "0", "type": "string"},
    ],
}

buffer = io.BytesIO()
fastavro.writer(buffer, schema, records, sync_interval=total_bytes_per_block, codec="deflate")
buffer.seek(0)

actual_df = cudf.read_avro(buffer, skiprows=0, num_rows=num_rows)

Extracted from test_avro_reader_fastavro_integration.py::test_avro_reader_multiblock.

Neither

compute-sanitizer --tool=memcheck python bug.py

nor

compute-sanitizer --tool=racecheck python bug.py

are clean.

Exemplar stack traces:

memcheck ``` ========= COMPUTE-SANITIZER ========= Invalid __global__ read of size 1 bytes ========= at 0x2080 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:807:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) ========= by thread (32,0,0) in block (0,0,0) ========= Address 0x7f6078604cb3 is out of bounds ========= and is 2,356 bytes after the nearest allocation at 0x7f6078601600 of size 11,648 bytes ========= Device Frame:/home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1109:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included) [0x6050] ========= Saved host backtrace up to driver entry point at kernel launch time ========= Host Frame: [0x332470] ========= in /usr/lib/x86_64-linux-gnu/libcuda.so.1 ========= Host Frame: [0x14fb4] ========= in /home/coder/.conda/envs/rapids/lib/libcudart.so.12 ========= Host Frame:cudaLaunchKernel [0x70aae] ========= in /home/coder/.conda/envs/rapids/lib/libcudart.so.12 ========= Host Frame:/home/coder/.conda/envs/rapids/targets/x86_64-linux/include/cuda_runtime.h:216:cudaError cudaLaunchKernel(char const*, dim3, dim3, void**, unsigned long, CUstream_st*) [0x12a5605] ========= in /home/coder/cudf/cpp/build/release/libcudf.so ========= Host Frame:/tmp/tmpxft_0003da43_00000000-6_gpuinflate.compute_90.cudafe1.stub.c:1:__device_stub__ZN4cudf2io14inflate_kernelILi128EEEvNS_11device_spanIKNS2_IKhLm18446744073709551615EEELm18446744073709551615EEENS2_IKNS2_IhLm18446744073709551615EEELm18446744073709551615EEENS2_INS0_18compression_resultELm18446744073709551615EEENS0_20gzip_header_includedE(cudf::device_span const, 18446744073709551615ul>&, cudf::device_span const, 18446744073709551615ul>&, cudf::device_span&, cudf::io::gzip_header_included) [0x12a4de6] ========= in /home/coder/cudf/cpp/build/release/libcudf.so ========= Host Frame:/tmp/tmpxft_0003da43_00000000-6_gpuinflate.compute_90.cudafe1.stub.c:4:void cudf::io::__wrapper__device_stub_inflate_kernel<128>(cudf::device_span const, 18446744073709551615ul>&, cudf::device_span const, 18446744073709551615ul>&, cudf::device_span&, cudf::io::gzip_header_included&) [0x12a4e1e] ========= in /home/coder/cudf/cpp/build/release/libcudf.so ========= Host Frame:/home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1145:void cudf::io::inflate_kernel<128>(cudf::device_span const, 18446744073709551615ul>, cudf::device_span const, 18446744073709551615ul>, cudf::device_span, cudf::io::gzip_header_included) [0x12a5598] ========= in /home/coder/cudf/cpp/build/release/libcudf.so ========= Host Frame:/home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1214:cudf::io::gpuinflate(cudf::device_span const, 18446744073709551615ul>, cudf::device_span const, 18446744073709551615ul>, cudf::device_span, cudf::io::gzip_header_included, rmm::cuda_stream_view) [0x12a49ef] ========= in /home/coder/cudf/cpp/build/release/libcudf.so ========= Host Frame:/home/coder/cudf/cpp/src/io/avro/reader_impl.cu:227:cudf::io::detail::avro::decompress_data(cudf::io::datasource&, cudf::io::detail::avro::metadata&, rmm::device_buffer const&, rmm::cuda_stream_view) [0x123db3c] ========= in /home/coder/cudf/cpp/build/release/libcudf.so ========= Host Frame:/home/coder/cudf/cpp/src/io/avro/reader_impl.cu:528:cudf::io::detail::avro::read_avro(std::unique_ptr >&&, cudf::io::avro_reader_options const&, rmm::cuda_stream_view, rmm::mr::device_memory_resource*) [0x123fa1f] ========= in /home/coder/cudf/cpp/build/release/libcudf.so ========= Host Frame:cudf::io::read_avro(cudf::io::avro_reader_options const&, rmm::mr::device_memory_resource*) [0x13019ee] ========= in /home/coder/cudf/cpp/build/release/libcudf.so ========= Host Frame: [0x2ba3c] ========= in /home/coder/.conda/envs/rapids/lib/python3.10/site-packages/cudf/_lib/avro.cpython-310-x86_64-linux-gnu.so ========= Host Frame: [0x2d29f] ========= in /home/coder/.conda/envs/rapids/lib/python3.10/site-packages/cudf/_lib/avro.cpython-310-x86_64-linux-gnu.so ========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/ceval.c:4181:_PyEval_EvalFrameDefault [0x139022] ========= in /home/coder/.conda/envs/rapids/bin/python ========= Host Frame:/usr/local/src/conda/python-3.10.13/Objects/call.c:342:_PyFunction_Vectorcall [0x1448cc] ========= in /home/coder/.conda/envs/rapids/bin/python ========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/ceval.c:4231:_PyEval_EvalFrameDefault [0x1357dc] ========= in /home/coder/.conda/envs/rapids/bin/python ========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/ceval.c:5067:_PyEval_Vector [0x1d7870] ========= in /home/coder/.conda/envs/rapids/bin/python ========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/ceval.c:1135:PyEval_EvalCode [0x1d77b7] ========= in /home/coder/.conda/envs/rapids/bin/python ========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/pythonrun.c:1292:run_eval_code_obj [0x207d1a] ========= in /home/coder/.conda/envs/rapids/bin/python ========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/pythonrun.c:1313:run_mod [0x203123] ========= in /home/coder/.conda/envs/rapids/bin/python ========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/pythonrun.c:1208:pyrun_file.cold [0x9a4d1] ========= in /home/coder/.conda/envs/rapids/bin/python ========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/pythonrun.c:456:_PyRun_SimpleFileObject [0x1fd60e] ========= in /home/coder/.conda/envs/rapids/bin/python ========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/pythonrun.c:90:_PyRun_AnyFileObject [0x1fd1a4] ========= in /home/coder/.conda/envs/rapids/bin/python ========= Host Frame:/usr/local/src/conda/python-3.10.13/Modules/main.c:670:Py_RunMain [0x1fa39b] ========= in /home/coder/.conda/envs/rapids/bin/python ========= Host Frame:/usr/local/src/conda/python-3.10.13/Modules/main.c:1091:Py_BytesMain [0x1cae17] ========= in /home/coder/.conda/envs/rapids/bin/python ========= Host Frame: [0x29d90] ========= in /usr/lib/x86_64-linux-gnu/libc.so.6 ========= Host Frame:__libc_start_main [0x29e40] ========= in /usr/lib/x86_64-linux-gnu/libc.so.6 ========= Host Frame: [0x1cad11] ========= in /home/coder/.conda/envs/rapids/bin/python ========= ```
racecheck ``` ========= COMPUTE-SANITIZER ========= Error: Race reported between Read access at 0xe00 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:789:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) ========= and Write access at 0x1930 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:543:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [16132 hazards] ========= and Write access at 0x5660 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:661:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [16156 hazards] ========= ========= Error: Race reported between Write access at 0xd90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:957:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int) ========= and Read access at 0x33c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:590:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1144 hazards] ========= and Read access at 0x5250 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:642:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [6592 hazards] ========= ========= Error: Race reported between Read access at 0x810 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:954:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int) ========= and Write access at 0x59c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:665:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1032 hazards] ========= ========= Error: Race reported between Read access at 0xa70 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:784:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) ========= and Write access at 0x5930 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:663:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1028 hazards] ========= and Write access at 0x5f90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:671:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [4 hazards] ========= ========= Error: Race reported between Write access at 0x11c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:793:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) ========= and Read access at 0xf90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:523:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [500 hazards] ========= and Read access at 0x5dd0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:670:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [4 hazards] ========= ========= Error: Race reported between Write access at 0xf60 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:962:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int) ========= and Read access at 0xdb0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:522:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [272 hazards] ========= ========= Error: Race reported between Write access at 0x5d70 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1104:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included) ========= and Read access at 0x5d0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:951:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int) [8 hazards] ========= ========= Warning: Race reported between Read access at 0x3b0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:775:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) ========= and Write access at 0x3000 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:826:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) [8 hazards] ========= ========= Warning: Race reported between Read access at 0x31a0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1068:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included) ========= and Write access at 0x4900 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1081:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included) [4 hazards] ========= ========= Error: Race reported between Read access at 0xe00 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:789:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) ========= and Write access at 0x1930 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:543:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [16132 hazards] ========= and Write access at 0x5660 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:661:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [16156 hazards] ========= ========= Error: Race reported between Write access at 0xd90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:957:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int) ========= and Read access at 0x33c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:590:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1144 hazards] ========= and Read access at 0x5250 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:642:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [6592 hazards] ========= ========= Error: Race reported between Read access at 0x810 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:954:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int) ========= and Write access at 0x59c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:665:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1032 hazards] ========= ========= Error: Race reported between Read access at 0xa70 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:784:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) ========= and Write access at 0x5930 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:663:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1028 hazards] ========= and Write access at 0x5f90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:671:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [4 hazards] ========= ========= Error: Race reported between Write access at 0x11c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:793:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) ========= and Read access at 0xf90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:523:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [500 hazards] ========= and Read access at 0x5dd0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:670:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [4 hazards] ========= ========= Error: Race reported between Write access at 0xf60 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:962:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int) ========= and Read access at 0xdb0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:522:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [272 hazards] ========= ========= Error: Race reported between Write access at 0x5d70 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1104:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included) ========= and Read access at 0x5d0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:951:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int) [8 hazards] ========= ========= Warning: Race reported between Read access at 0x3b0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:775:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) ========= and Write access at 0x3000 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:826:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) [8 hazards] ========= ========= Warning: Race reported between Read access at 0x31a0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1068:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included) ========= and Write access at 0x4900 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1081:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included) [4 hazards] ========= ========= RACECHECK SUMMARY: 18 hazards displayed (14 errors, 4 warnings) ```

I do not know if the racecheck warnings are as problematic as the memcheck ones, gpuinflate.cu is littered with volatile accesses to the inter-warp communication queue without (AFAICT) any synchronisation, but perhaps there are enough spin-waits that it is "OK"?

vyasr commented 3 months ago

I can't reproduce the memcheck error right now. I do see the same racecheck issues though.