Closed aabizri closed 4 years ago
EDIT
When testing for the error, I didn't correctly check whether it came from the OpenCL implementation (I hadn't actually switched to intel-ocl-sdk when I thought I had). After trying again, I didn't get the error on intel-ocl-sdk, so it seems to be a beignet bug. I will be reporting it there. Closing the issue.

Summary
On ocl 0.19, when trying to build a Context (or a ProQue) in two concurrent threads, a SIGABRT double free or SIGSEGV error is triggered. On a single thread there's no bug.

As OpenCL functions since 1.1 are all thread-safe except for clSetKernelArg(), the crash can't be explained as undefined behavior under the spec. ~When tested against both beignet and intel-ocl-sdk, I got the same errors, indicating it doesn't come from the particular implementation. It is thus highly probable the error comes from ocl.~

Tested on both:
stable (rustc 1.37.0 (eae3437df 2019-08-13))
nightly (rustc 1.39.0-nightly (96d07e0ac 2019-09-15))

Error & debugging
On SIGABRT these are the messages printed, in decreasing frequency of occurrence:
corrupted size vs. prev_size (by a wide margin the most common)
double free or corruption (!prev)
double free or corruption (out)
double free or corruption (fasttop)
clang (LLVM option parsing): for the -memdep-block-scan-limit option: may only occur zero or one times! (but I'm not sure it's linked)

On SIGSEGV no debug messages are printed. Rarely (one in 20 tries, I would say), the sample program doesn't error out.

As the error comes from the memory side, debugging with MALLOC_CHECK_=1 (or 2) restricts the errors to either SIGSEGV or SIGABRT with free(): invalid pointer as message.

When debugging with GDB, the error always occurred while in ocl-core::retain_context or ocl-core::retain_mem_object.
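
Since the outcome varies from run to run, it can help to count how each run ends instead of eyeballing it. Below is a minimal sketch, assuming a Unix system where SIGABRT is signal 6 and SIGSEGV is signal 11 (as on Linux). The helper names classify and tally are made up for illustration and are not part of ocl. It expects the path to the compiled test binary (for example one built with cargo test --no-run), because cargo itself exits with its own status code when a test process is killed by a signal.

```rust
use std::collections::BTreeMap;
use std::os::unix::process::ExitStatusExt; // Unix-only: exposes the terminating signal
use std::process::{Command, ExitStatus, Stdio};

/// Describes how a child process ended: by signal, or by exit code.
/// The signal numbers are an assumption that holds on Linux.
fn classify(status: ExitStatus) -> String {
    match (status.signal(), status.code()) {
        (Some(6), _) => "SIGABRT".to_string(),
        (Some(11), _) => "SIGSEGV".to_string(),
        (Some(n), _) => format!("signal {}", n),
        (None, Some(c)) => format!("exit {}", c),
        (None, None) => "unknown".to_string(),
    }
}

/// Runs `program` with `args` a total of `runs` times, with MALLOC_CHECK_
/// set in the child's environment, and counts each kind of outcome.
fn tally(
    program: &str,
    args: &[&str],
    malloc_check: &str,
    runs: usize,
) -> std::io::Result<BTreeMap<String, usize>> {
    let mut counts = BTreeMap::new();
    for _ in 0..runs {
        let status = Command::new(program)
            .args(args)
            .env("MALLOC_CHECK_", malloc_check)
            .stdout(Stdio::null())
            .stderr(Stdio::null())
            .status()?;
        *counts.entry(classify(status)).or_insert(0) += 1;
    }
    Ok(counts)
}

fn main() -> std::io::Result<()> {
    // First argument: path to the test binary; the rest are passed through.
    let args: Vec<String> = std::env::args().skip(1).collect();
    let (program, rest) = match args.split_first() {
        Some(split) => split,
        None => {
            eprintln!("usage: tally <test-binary> [args...]");
            return Ok(());
        }
    };
    let rest: Vec<&str> = rest.iter().map(String::as_str).collect();
    // 20 runs with MALLOC_CHECK_=2; both values are arbitrary choices.
    for (outcome, n) in &tally(program, &rest, "2", 20)? {
        println!("{:>10}: {}", outcome, n);
    }
    Ok(())
}
```

Passing --test-threads=2 after the binary path keeps the two tests running concurrently, the same as in the cargo invocation.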

Reproduction
I have been able to reduce the reproduction to the following code:
Run with cargo test -- --test-threads=2 to trigger the error, and cargo test -- --test-threads=1 to see that it isn't triggered in single-threaded situations.
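
The trigger condition, two builds overlapping in time, can also be forced explicitly rather than left to the test harness's scheduling. The following is a minimal sketch using only the standard library, not the reduced reproduction itself. build_concurrently is a hypothetical helper, and the closure in main is a stand-in: in a real check it would wrap the ocl call under test (for instance ocl::Context::builder().build()) and return whether it succeeded.

```rust
use std::sync::{Arc, Barrier};
use std::thread;

/// Runs `build` once on each of `n` threads. A barrier releases all threads
/// at the same moment so that the calls overlap as much as possible.
/// Each entry of the result is `Err` if that thread panicked.
fn build_concurrently<T, F>(n: usize, build: F) -> Vec<thread::Result<T>>
where
    T: Send + 'static,
    F: Fn() -> T + Send + Sync + 'static,
{
    let barrier = Arc::new(Barrier::new(n));
    let build = Arc::new(build);
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let barrier = Arc::clone(&barrier);
            let build = Arc::clone(&build);
            thread::spawn(move || {
                barrier.wait(); // wait until every thread is ready
                build()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join()).collect()
}

fn main() {
    // Stand-in workload; replace the closure with the call under test.
    let results = build_concurrently(2, || 21 * 2);
    for r in &results {
        println!("{:?}", r.as_ref().ok());
    }
}
```

A crash by signal takes down the whole process, so join only reports panics; a double free would still abort the program as described above.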