modularml / mojo

The Mojo Programming Language
https://docs.modular.com/mojo

[BUG] Flaky segfault during `mojo build` with `-D MOJO_ENABLE_ASSERTIONS` #2751

Open gabrieldemarmiesse opened 1 month ago

gabrieldemarmiesse commented 1 month ago

Bug description

This bug is a blocker for https://github.com/modularml/mojo/issues/2687

When compiling test_string.mojo with -D MOJO_ENABLE_ASSERTIONS I noticed that I got some flaky segfaults.

It's reproducible in the CI as you can see here: https://github.com/modularml/mojo/actions/runs/9139790714/job/25132482734?pr=2718#step:9:86

I also tried to make a minimal reproducible example. It's hard because the more code is removed, the more likely it is that the build will succeed.

Here is the type of output I get:

[17257:17257:20240519,125524.823840:ERROR file_io_posix.cc:144] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: No such file or directory (2)
[17257:17257:20240519,125524.823897:ERROR file_io_posix.cc:144] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: No such file or directory (2)
Please submit a bug report to https://github.com/modularml/mojo/issues and include the crash backtrace along with all the relevant source codes.
Stack dump:
0.      Program arguments: mojo build -D MOJO_ENABLE_ASSERTIONS trying_stuff2.mojo
 #0 0x000056519f3e3838 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x1293838)
 #1 0x000056519f3e165e (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x129165e)
 #2 0x000056519f3e3ecd (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x1293ecd)
 #3 0x00007f77a2723520 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #4 0x00005651a06fbaf3 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x25abaf3)
 #5 0x00005651a11befde (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x306efde)
 #6 0x00005651a11c00f9 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x30700f9)
 #7 0x00005651a06fc07e (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x25ac07e)
 #8 0x00005651a06fd2f4 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x25ad2f4)
 #9 0x00005651a06fc6b0 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x25ac6b0)
#10 0x00005651a0da0e0d (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x2c50e0d)
#11 0x000056519f7de1c2 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x168e1c2)
#12 0x00005651a06fe486 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x25ae486)
#13 0x00005651a06f21fd (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x25a21fd)
#14 0x00005651a06f1c12 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x25a1c12)
#15 0x00005651a0b78191 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x2a28191)
#16 0x00005651a11b0e80 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x3060e80)
#17 0x00005651a11b0f79 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x3060f79)
#18 0x00005651a0b8f709 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x2a3f709)
#19 0x00005651a0b76694 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x2a26694)
#20 0x000056519f8935f5 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x17435f5)
#21 0x000056519f31d0da (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x11cd0da)
#22 0x000056519f31b9a4 (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x11cb9a4)
#23 0x00007f77a270ad90 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
#24 0x00007f77a270ae40 __libc_start_main (/usr/lib/x86_64-linux-gnu/libc.so.6+0x29e40)
#25 0x000056519f31b32e (/root/.modular/pkg/packages.modular.com_nightly_mojo/bin/mojo+0x11cb32e)
mojo crashed!
Please file a bug report.

Steps to reproduce

Use nightly. Download the following gist and put it in a file: https://gist.github.com/gabrieldemarmiesse/bdbb73ba9672c071a99e6f13abe1e8e4

Then run the following command (possibly multiple times since it's flaky):

rm -rf $HOME/.modular/.mojo_cache && mojo build -D MOJO_ENABLE_ASSERTIONS trying_stuff.mojo

It's important to delete the cache before every compilation: once a build succeeds, the cache is reused and the segfault will not appear again.

On my computers (I tested with two of them) I have a 50% failure rate. The CI can reproduce the error too.
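To put a number on the flakiness, the reproduction loop can be scripted. The following is a minimal sketch, not part of the original report: the helper name `measure_failure_rate` and its parameters are hypothetical, and it assumes that a crashed build exits with a non-zero status. It clears the cache before every run, as described above, and reports the fraction of runs that failed.

```python
import shutil
import subprocess


def measure_failure_rate(command, cache_dir, runs=20):
    """Run `command` `runs` times, deleting `cache_dir` before each run.

    Returns the fraction of runs that exited with a non-zero status.
    (Hypothetical helper; assumes a crash shows up as a non-zero exit code.)
    """
    failures = 0
    for _ in range(runs):
        # Delete the compilation cache so an earlier success is not reused.
        shutil.rmtree(cache_dir, ignore_errors=True)
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # Any non-zero return code (including a negative one, which
        # subprocess reports on POSIX when a signal kills the process)
        # is counted as a failed build.
        if result.returncode != 0:
            failures += 1
    return failures / runs
```

For the case in this report, it could be called with `["mojo", "build", "-D", "MOJO_ENABLE_ASSERTIONS", "trying_stuff.mojo"]` as `command` and `Path.home() / ".modular" / ".mojo_cache"` as `cache_dir`.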

If you want a higher failure rate, instead of copying my gist, copy the content of this file: https://github.com/modularml/mojo/blob/826b2bd2cb9fa27258cce9245d0d50824de12e18/stdlib/test/builtin/test_string.mojo

When I compile this file, I have a failure rate of 90%.

System information

debian bookworm/sid, in docker, inside wsl 2
modular 0.7.2 (d0adc668)
mojo 2024.5.1905 (46b7e7ee)

It can also be reproduced in the CI, and the log file I linked earlier was from a macOS build, so it seems to affect all operating systems.

JoeLoser commented 1 month ago

This internally hits an LLVM assert:

Assertion failed: (Iter != this->end() && "DenseMap::at failed due to a missing key"), function at, file DenseMap.h, line 213.

It's coming from M::KGEN::Elaborator::lookupConcreteFunction. FYI @Mogball

Mogball commented 1 month ago

debug_assert is playing with fire with its dependencies on String. This is likely another case of a crash instead of an error message.

gabrieldemarmiesse commented 3 weeks ago

While we have enabled all assertions in the CI, they are not enabled for test_string.mojo. When this bug is resolved, we should not forget to enable assertions there too.