Open Sjors opened 2 months ago
You re-ran the same task on the same commit on the same machine 3 hours later and it passed: https://cirrus-ci.com/task/6619444124844032?logs=ci#L313 vs https://cirrus-ci.com/task/5557228785106944?logs=ci#L311
Did you change anything in between?
Also, probably unrelated, but if you want, you can test https://github.com/bitcoin/bitcoin/pull/30639 and https://github.com/bitcoin/bitcoin/pull/30634
@maflcko yes, I first reproduced the issue and then tested the workaround `vm.mmap_rnd_bits=28`. See https://github.com/Sjors/bitcoin/pull/51.
I'll try those clang-19 PRs. If that fixes the issue then presumably the issue is in llvm and they should consider backporting additional commits. But if it doesn't then maybe the problem is on our side (even though it's trivial to work around).
I see. So in theory it should be reproducible by setting up a vanilla Ubuntu 24.04 (or later) host to run the CI tasks. I guess no one has done so yet, given that you are the first one to observe the issue. However, if it is reproducible, then it probably should be fixed.
@maflcko clang 19 fixes neither, see https://github.com/Sjors/bitcoin/pull/59.
https://github.com/llvm/llvm-project/commit/7d039effc4930be9240446a4241d268a39960e0b only added two bits 28->30, so a failure with 32 is still expected, unless I am missing something.
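To make the comparison above concrete, here is a minimal sketch that reads the host's current setting and compares it to the limit. The limit of 30 is an assumption taken from the reading of the llvm commit in the comment above, not a verified property of any particular clang build, and the helper names are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Assumption: 30 is the highest value the patched sanitizer runtime tolerates,
# based on the "28->30" reading of the llvm commit discussed above.
ASSUMED_MAX_SUPPORTED_BITS = 30


def read_mmap_rnd_bits(path: str = "/proc/sys/vm/mmap_rnd_bits") -> Optional[int]:
    """Return the kernel's current vm.mmap_rnd_bits, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def sanitizer_likely_ok(bits: int, max_supported: int = ASSUMED_MAX_SUPPORTED_BITS) -> bool:
    """True if the ASLR entropy is within the assumed supported range."""
    return bits <= max_supported


if __name__ == "__main__":
    bits = read_mmap_rnd_bits()
    if bits is None:
        print("vm.mmap_rnd_bits not readable on this host")
    elif sanitizer_likely_ok(bits):
        print(f"vm.mmap_rnd_bits={bits}: within the assumed limit")
    else:
        print(f"vm.mmap_rnd_bits={bits}: above the assumed limit, TSAN/MSAN may fail")
```

With the value 32 reported for the Ubuntu 24.04 host, this check would flag the host, which matches the expectation that a failure with 32 is still possible.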
The Cirrus CI on my fork of the repo runs on Ubuntu 24.04 with kernel version 6.8.0-38. This has `vm.mmap_rnd_bits=32` set, which causes the TSAN and MSAN jobs to fail. See:
TSAN: https://cirrus-ci.com/task/6619444124844032
MSAN: https://cirrus-ci.com/task/4578750543691776
This job was from mid-July. Just in case, I reproduced it against today's master: https://github.com/Sjors/bitcoin/pull/57 / https://cirrus-ci.com/task/4886869396160512
My (limited) understanding is that the underlying issue should have been fixed and the fix has been backported to llvm 18.1.3: https://github.com/google/sanitizers/issues/1614#issuecomment-2010316781
Ubuntu 24.04 has shipped that version since early July: https://launchpad.net/ubuntu/noble/amd64/clang-18
I can see in the CI log that this version was indeed used:
Although I can trivially work around the issue by setting `vm.mmap_rnd_bits=28`, perhaps there is a deeper issue worth investigating.
Possibly related: https://github.com/ClickHouse/ClickHouse/issues/64086 (they also tried 18.1.3 and 18.1.6).
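For anyone who wants to apply the workaround from a script on the CI host, a rough sketch follows. It assumes the standard `sysctl -w` command is available and that the script runs as root; the function names are hypothetical and not part of the CI setup.

```python
import subprocess
from typing import List


def workaround_command(bits: int = 28) -> List[str]:
    """Build the sysctl invocation that lowers ASLR entropy to `bits`."""
    return ["sysctl", "-w", f"vm.mmap_rnd_bits={bits}"]


def apply_workaround(bits: int = 28) -> None:
    """Run the command; raises CalledProcessError if sysctl fails (e.g. not root)."""
    subprocess.run(workaround_command(bits), check=True)


if __name__ == "__main__":
    # Only print the command here; call apply_workaround() explicitly as root.
    print(" ".join(workaround_command()))
```

A value set with `sysctl -w` lasts only until the next reboot, so it has to be reapplied on a freshly booted host.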