Extracted from test_avro_reader_fastavro_integration.py::test_avro_reader_multiblock.
Neither
compute-sanitizer --tool=memcheck python bug.py
nor
compute-sanitizer --tool=racecheck python bug.py
are clean.
Exemplar stack traces:
memcheck
```
========= COMPUTE-SANITIZER
========= Invalid __global__ read of size 1 bytes
========= at 0x2080 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:807:cudf::io::process_symbols(cudf::io::inflate_state_s *, int)
========= by thread (32,0,0) in block (0,0,0)
========= Address 0x7f6078604cb3 is out of bounds
========= and is 2,356 bytes after the nearest allocation at 0x7f6078601600 of size 11,648 bytes
========= Device Frame:/home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1109:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included) [0x6050]
========= Saved host backtrace up to driver entry point at kernel launch time
========= Host Frame: [0x332470]
========= in /usr/lib/x86_64-linux-gnu/libcuda.so.1
========= Host Frame: [0x14fb4]
========= in /home/coder/.conda/envs/rapids/lib/libcudart.so.12
========= Host Frame:cudaLaunchKernel [0x70aae]
========= in /home/coder/.conda/envs/rapids/lib/libcudart.so.12
========= Host Frame:/home/coder/.conda/envs/rapids/targets/x86_64-linux/include/cuda_runtime.h:216:cudaError cudaLaunchKernel(char const*, dim3, dim3, void**, unsigned long, CUstream_st*) [0x12a5605]
========= in /home/coder/cudf/cpp/build/release/libcudf.so
========= Host Frame:/tmp/tmpxft_0003da43_00000000-6_gpuinflate.compute_90.cudafe1.stub.c:1:__device_stub__ZN4cudf2io14inflate_kernelILi128EEEvNS_11device_spanIKNS2_IKhLm18446744073709551615EEELm18446744073709551615EEENS2_IKNS2_IhLm18446744073709551615EEELm18446744073709551615EEENS2_INS0_18compression_resultELm18446744073709551615EEENS0_20gzip_header_includedE(cudf::device_span const, 18446744073709551615ul>&, cudf::device_span const, 18446744073709551615ul>&, cudf::device_span&, cudf::io::gzip_header_included) [0x12a4de6]
========= in /home/coder/cudf/cpp/build/release/libcudf.so
========= Host Frame:/tmp/tmpxft_0003da43_00000000-6_gpuinflate.compute_90.cudafe1.stub.c:4:void cudf::io::__wrapper__device_stub_inflate_kernel<128>(cudf::device_span const, 18446744073709551615ul>&, cudf::device_span const, 18446744073709551615ul>&, cudf::device_span&, cudf::io::gzip_header_included&) [0x12a4e1e]
========= in /home/coder/cudf/cpp/build/release/libcudf.so
========= Host Frame:/home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1145:void cudf::io::inflate_kernel<128>(cudf::device_span const, 18446744073709551615ul>, cudf::device_span const, 18446744073709551615ul>, cudf::device_span, cudf::io::gzip_header_included) [0x12a5598]
========= in /home/coder/cudf/cpp/build/release/libcudf.so
========= Host Frame:/home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1214:cudf::io::gpuinflate(cudf::device_span const, 18446744073709551615ul>, cudf::device_span const, 18446744073709551615ul>, cudf::device_span, cudf::io::gzip_header_included, rmm::cuda_stream_view) [0x12a49ef]
========= in /home/coder/cudf/cpp/build/release/libcudf.so
========= Host Frame:/home/coder/cudf/cpp/src/io/avro/reader_impl.cu:227:cudf::io::detail::avro::decompress_data(cudf::io::datasource&, cudf::io::detail::avro::metadata&, rmm::device_buffer const&, rmm::cuda_stream_view) [0x123db3c]
========= in /home/coder/cudf/cpp/build/release/libcudf.so
========= Host Frame:/home/coder/cudf/cpp/src/io/avro/reader_impl.cu:528:cudf::io::detail::avro::read_avro(std::unique_ptr >&&, cudf::io::avro_reader_options const&, rmm::cuda_stream_view, rmm::mr::device_memory_resource*) [0x123fa1f]
========= in /home/coder/cudf/cpp/build/release/libcudf.so
========= Host Frame:cudf::io::read_avro(cudf::io::avro_reader_options const&, rmm::mr::device_memory_resource*) [0x13019ee]
========= in /home/coder/cudf/cpp/build/release/libcudf.so
========= Host Frame: [0x2ba3c]
========= in /home/coder/.conda/envs/rapids/lib/python3.10/site-packages/cudf/_lib/avro.cpython-310-x86_64-linux-gnu.so
========= Host Frame: [0x2d29f]
========= in /home/coder/.conda/envs/rapids/lib/python3.10/site-packages/cudf/_lib/avro.cpython-310-x86_64-linux-gnu.so
========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/ceval.c:4181:_PyEval_EvalFrameDefault [0x139022]
========= in /home/coder/.conda/envs/rapids/bin/python
========= Host Frame:/usr/local/src/conda/python-3.10.13/Objects/call.c:342:_PyFunction_Vectorcall [0x1448cc]
========= in /home/coder/.conda/envs/rapids/bin/python
========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/ceval.c:4231:_PyEval_EvalFrameDefault [0x1357dc]
========= in /home/coder/.conda/envs/rapids/bin/python
========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/ceval.c:5067:_PyEval_Vector [0x1d7870]
========= in /home/coder/.conda/envs/rapids/bin/python
========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/ceval.c:1135:PyEval_EvalCode [0x1d77b7]
========= in /home/coder/.conda/envs/rapids/bin/python
========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/pythonrun.c:1292:run_eval_code_obj [0x207d1a]
========= in /home/coder/.conda/envs/rapids/bin/python
========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/pythonrun.c:1313:run_mod [0x203123]
========= in /home/coder/.conda/envs/rapids/bin/python
========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/pythonrun.c:1208:pyrun_file.cold [0x9a4d1]
========= in /home/coder/.conda/envs/rapids/bin/python
========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/pythonrun.c:456:_PyRun_SimpleFileObject [0x1fd60e]
========= in /home/coder/.conda/envs/rapids/bin/python
========= Host Frame:/usr/local/src/conda/python-3.10.13/Python/pythonrun.c:90:_PyRun_AnyFileObject [0x1fd1a4]
========= in /home/coder/.conda/envs/rapids/bin/python
========= Host Frame:/usr/local/src/conda/python-3.10.13/Modules/main.c:670:Py_RunMain [0x1fa39b]
========= in /home/coder/.conda/envs/rapids/bin/python
========= Host Frame:/usr/local/src/conda/python-3.10.13/Modules/main.c:1091:Py_BytesMain [0x1cae17]
========= in /home/coder/.conda/envs/rapids/bin/python
========= Host Frame: [0x29d90]
========= in /usr/lib/x86_64-linux-gnu/libc.so.6
========= Host Frame:__libc_start_main [0x29e40]
========= in /usr/lib/x86_64-linux-gnu/libc.so.6
========= Host Frame: [0x1cad11]
========= in /home/coder/.conda/envs/rapids/bin/python
=========
```
racecheck
```
========= COMPUTE-SANITIZER
========= Error: Race reported between Read access at 0xe00 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:789:cudf::io::process_symbols(cudf::io::inflate_state_s *, int)
========= and Write access at 0x1930 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:543:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [16132 hazards]
========= and Write access at 0x5660 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:661:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [16156 hazards]
=========
========= Error: Race reported between Write access at 0xd90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:957:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int)
========= and Read access at 0x33c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:590:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1144 hazards]
========= and Read access at 0x5250 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:642:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [6592 hazards]
=========
========= Error: Race reported between Read access at 0x810 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:954:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int)
========= and Write access at 0x59c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:665:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1032 hazards]
=========
========= Error: Race reported between Read access at 0xa70 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:784:cudf::io::process_symbols(cudf::io::inflate_state_s *, int)
========= and Write access at 0x5930 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:663:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1028 hazards]
========= and Write access at 0x5f90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:671:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [4 hazards]
=========
========= Error: Race reported between Write access at 0x11c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:793:cudf::io::process_symbols(cudf::io::inflate_state_s *, int)
========= and Read access at 0xf90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:523:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [500 hazards]
========= and Read access at 0x5dd0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:670:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [4 hazards]
=========
========= Error: Race reported between Write access at 0xf60 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:962:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int)
========= and Read access at 0xdb0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:522:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [272 hazards]
=========
========= Error: Race reported between Write access at 0x5d70 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1104:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included)
========= and Read access at 0x5d0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:951:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int) [8 hazards]
=========
========= Warning: Race reported between Read access at 0x3b0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:775:cudf::io::process_symbols(cudf::io::inflate_state_s *, int)
========= and Write access at 0x3000 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:826:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) [8 hazards]
=========
========= Warning: Race reported between Read access at 0x31a0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1068:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included)
========= and Write access at 0x4900 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1081:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included) [4 hazards]
=========
========= Error: Race reported between Read access at 0xe00 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:789:cudf::io::process_symbols(cudf::io::inflate_state_s *, int)
========= and Write access at 0x1930 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:543:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [16132 hazards]
========= and Write access at 0x5660 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:661:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [16156 hazards]
=========
========= Error: Race reported between Write access at 0xd90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:957:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int)
========= and Read access at 0x33c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:590:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1144 hazards]
========= and Read access at 0x5250 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:642:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [6592 hazards]
=========
========= Error: Race reported between Read access at 0x810 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:954:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int)
========= and Write access at 0x59c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:665:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1032 hazards]
=========
========= Error: Race reported between Read access at 0xa70 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:784:cudf::io::process_symbols(cudf::io::inflate_state_s *, int)
========= and Write access at 0x5930 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:663:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1028 hazards]
========= and Write access at 0x5f90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:671:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [4 hazards]
=========
========= Error: Race reported between Write access at 0x11c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:793:cudf::io::process_symbols(cudf::io::inflate_state_s *, int)
========= and Read access at 0xf90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:523:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [500 hazards]
========= and Read access at 0x5dd0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:670:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [4 hazards]
=========
========= Error: Race reported between Write access at 0xf60 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:962:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int)
========= and Read access at 0xdb0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:522:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [272 hazards]
=========
========= Error: Race reported between Write access at 0x5d70 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1104:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included)
========= and Read access at 0x5d0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:951:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int) [8 hazards]
=========
========= Warning: Race reported between Read access at 0x3b0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:775:cudf::io::process_symbols(cudf::io::inflate_state_s *, int)
========= and Write access at 0x3000 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:826:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) [8 hazards]
=========
========= Warning: Race reported between Read access at 0x31a0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1068:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included)
========= and Write access at 0x4900 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1081:void cudf::io::inflate_kernel<(int)128>(cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, (unsigned long)18446744073709551615>, cudf::device_span, cudf::io::gzip_header_included) [4 hazards]
=========
========= RACECHECK SUMMARY: 18 hazards displayed (14 errors, 4 warnings)
```
I do not know if the racecheck warnings are as problematic as the memcheck ones, gpuinflate.cu is littered with volatile accesses to the inter-warp communication queue without (AFAICT) any synchronisation, but perhaps there are enough spin-waits that it is "OK"?
Describe the bug
Extracted from
test_avro_reader_fastavro_integration.py::test_avro_reader_multiblock
.Neither
nor
are clean.
Exemplar stack traces:
memcheck
``` ========= COMPUTE-SANITIZER ========= Invalid __global__ read of size 1 bytes ========= at 0x2080 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:807:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) ========= by thread (32,0,0) in block (0,0,0) ========= Address 0x7f6078604cb3 is out of bounds ========= and is 2,356 bytes after the nearest allocation at 0x7f6078601600 of size 11,648 bytes ========= Device Frame:/home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1109:void cudf::io::inflate_kernel<(int)128>(cudf::device_spanracecheck
``` ========= COMPUTE-SANITIZER ========= Error: Race reported between Read access at 0xe00 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:789:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) ========= and Write access at 0x1930 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:543:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [16132 hazards] ========= and Write access at 0x5660 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:661:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [16156 hazards] ========= ========= Error: Race reported between Write access at 0xd90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:957:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int) ========= and Read access at 0x33c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:590:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1144 hazards] ========= and Read access at 0x5250 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:642:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [6592 hazards] ========= ========= Error: Race reported between Read access at 0x810 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:954:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int) ========= and Write access at 0x59c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:665:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1032 hazards] ========= ========= Error: Race reported between Read access at 0xa70 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:784:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) ========= and Write access at 0x5930 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:663:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [1028 hazards] ========= and Write access at 0x5f90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:671:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [4 hazards] ========= ========= Error: Race reported between Write access at 0x11c0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:793:cudf::io::process_symbols(cudf::io::inflate_state_s *, int) ========= and Read access at 0xf90 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:523:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [500 hazards] ========= and Read access at 0x5dd0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:670:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [4 hazards] ========= ========= Error: Race reported between Write access at 0xf60 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:962:cudf::io::prefetch_warp(volatile cudf::io::inflate_state_s *, int) ========= and Read access at 0xdb0 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:522:cudf::io::decode_symbols(cudf::io::inflate_state_s *) [272 hazards] ========= ========= Error: Race reported between Write access at 0x5d70 in /home/coder/cudf/cpp/src/io/comp/gpuinflate.cu:1104:void cudf::io::inflate_kernel<(int)128>(cudf::device_spanI do not know if the racecheck warnings are as problematic as the memcheck ones,
gpuinflate.cu
is littered withvolatile
accesses to the inter-warp communication queue without (AFAICT) any synchronisation, but perhaps there are enough spin-waits that it is "OK"?