oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/

"malloc(): smallbin double linked list corrupted" when testing re2 #2262

Open mudge opened 3 years ago

mudge commented 3 years ago

@gogainda suggested I report the following build error when trying to test re2 against TruffleRuby HEAD:

*** Error in `/home/runner/.rubies/truffleruby-head/bin/truffleruby': malloc(): smallbin double linked list corrupted: 0x000000000b4ce570 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777f5)[0x7f7ed5ce97f5]
/lib/x86_64-linux-gnu/libc.so.6(+0x82679)[0x7f7ed5cf4679]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7f7ed5cf61d4]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(_Znwm+0x15)[0x7f7e67a48055]
/usr/lib/libre2.so.0(+0x2c095)[0x7f7e67db1095]
/usr/lib/libre2.so.0(_ZN3re23RE24InitERKNS_11StringPieceERKNS0_7OptionsE+0x23f)[0x7f7e67dc871f]
/usr/lib/libre2.so.0(_ZN3re23RE2C2EPKc+0xd9)[0x7f7e67dc9b79]
/home/runner/.rubies/truffleruby-head/bin/truffleruby[0x3a2f225]
/home/runner/.rubies/truffleruby-head/bin/truffleruby[0x3a2cdf4]
/home/runner/.rubies/truffleruby-head/bin/truffleruby[0x58ddf5]

This is using ruby/setup-ruby@v1 with truffleruby-head on ubuntu-16.04 and attempting to run the re2 test suite after compiling it against release 2015-05-01 of the underlying Google re2 library.

I've tried building against every ABI version of libre2-dev, as well as the default Ubuntu 16.04 package, but I consistently see malloc errors, e.g. memory corruption:

*** Error in `/home/runner/.rubies/truffleruby-head/bin/truffleruby': malloc(): memory corruption: 0x000000000bb99780 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777f5)[0x7fdb884ae7f5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8215e)[0x7fdb884b915e]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7fdb884bb1d4]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(_Znwm+0x15)[0x7fdb18fb5055]
/usr/lib/libre2.so.9(+0x30d97)[0x7fdb19322d97]
/usr/lib/libre2.so.9(_ZN3re23RE24InitERKNS_11StringPieceERKNS0_7OptionsE+0x194)[0x7fdb19339d54]
/usr/lib/libre2.so.9(_ZN3re23RE2C1EPKc+0xf5)[0x7fdb1933af55]
/home/runner/.rubies/truffleruby-head/bin/truffleruby[0x3a2f225]
/home/runner/.rubies/truffleruby-head/bin/truffleruby[0x3a2cdf4]
/home/runner/.rubies/truffleruby-head/bin/truffleruby[0x58ddf5]

Both backtraces pass through the RE2::Init function, which takes a const StringPiece& pattern and a const Options& options.
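For anyone reading the mangled frames above, here is a minimal sketch of how they can be decoded. It assumes a GCC or Clang toolchain that ships `<cxxabi.h>`; the helper name `demangle` is made up for illustration and is not part of re2 or TruffleRuby.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: demangle an Itanium C++ ABI symbol name
// (the mangling scheme GCC and Clang use on Linux and macOS).
// Returns the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // __cxa_demangle allocates the result with malloc; the caller must free it.
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);
    return result;
}
```

Passing `_ZN3re23RE24InitERKNS_11StringPieceERKNS0_7OptionsE` should give `re2::RE2::Init(re2::StringPiece const&, re2::RE2::Options const&)`, `_ZN3re23RE2C2EPKc` is the `re2::RE2::RE2(char const*)` constructor, and `_Znwm` is `operator new(unsigned long)`. So both crashes happen while the RE2 constructor allocates memory inside Init.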

Strangely, the re2 test suite passes locally on macOS 11.2.1 using truffleruby 21.0.0 from ruby-build and re2 20210202 from Homebrew:

~/Projects/re2> bundle exec rake clean spec
mkdir -p tmp/x86_64-darwin18/re2/2.7.2
cd tmp/x86_64-darwin18/re2/2.7.2
/Users/mudge/.rubies/truffleruby-21.0.0/bin/truffleruby -I. ../../../../ext/re2/extconf.rb
checking for -lstdc++... yes
checking for stdint.h... yes
checking for rb_str_sublen()... no
checking for -lre2... yes
checking for re2 requires C++11 compiler... no
checking for RE2::Match() with endpos argument... yes
creating Makefile
cd -
cd tmp/x86_64-darwin18/re2/2.7.2
/usr/bin/make
compiling ../../../../ext/re2/re2.cc
In file included from ../../../../ext/re2/re2.cc:33:
/Users/mudge/.rubies/truffleruby-21.0.0/lib/cext/include/ruby/encoding.h:354:1: warning: 
      empty struct has size 0 in C, size 1 in C++ [-Wextern-c-compat]
struct rb_econv_t {};
^
1 warning generated.
linking shared-object re2.bundle
ld: warning: dylib (/usr/local/lib/libre2.dylib) was built for newer macOS version (11.0) than being linked (10.16)
ld: warning: dylib (/usr/local/lib/libre2.dylib) was built for newer macOS version (11.0) than being linked (10.16)
cd -
mkdir -p tmp/x86_64-darwin18/stage/lib
install -c tmp/x86_64-darwin18/re2/2.7.2/re2.bundle lib/re2.bundle
cp tmp/x86_64-darwin18/re2/2.7.2/re2.bundle tmp/x86_64-darwin18/stage/lib/re2.bundle
/Users/mudge/.rubies/truffleruby-21.0.0/bin/truffleruby -I/Users/mudge/.gem/truffleruby/2.7.2/gems/rspec-core-3.10.1/lib:/Users/mudge/.gem/truffleruby/2.7.2/gems/rspec-support-3.10.2/lib /Users/mudge/.gem/truffleruby/2.7.2/gems/rspec-core-3.10.1/exe/rspec --pattern spec/\*\*\{,/\*/\*\*\}/\*_spec.rb
Run options: include {:focus=>true}

All examples were filtered out; ignoring {:focus=>true}

Randomized with seed 54516
........................................................../Users/mudge/Projects/re2/spec/re2/regexp_spec.rb:84: warning: constant ::Fixnum is deprecated
................................................................................................

Finished in 0.627 seconds (files took 1.55 seconds to load)
154 examples, 0 failures

Randomized with seed 54516

eregon commented 3 years ago

That's odd: the C backtrace shows libre2 calling malloc, so it's hard to understand how TruffleRuby can influence this. But since it passes on CRuby there must be something, maybe in Sulong.

Could you try running on Ubuntu 18.04 or 20.04? 16.04 is quite old. Would also be worth trying TruffleRuby 21.0.0 in CI just to make sure it's not some kind of regression.

mudge commented 3 years ago

@eregon I've now updated the job to run the test suite against truffleruby-21.0.0 and truffleruby-head on both Ubuntu 18.04 and Ubuntu 20.04.

All combinations still fail but the error messages might be more informative, e.g.

mudge commented 3 years ago

I've also added macOS 10.15 to the build so you can see it passes for both truffleruby-head and truffleruby-21.0.0.

mudge commented 3 years ago

For consistency, I've updated the build to test against the same version of the underlying re2 library as the successful macOS job, compiled in the same way as the Homebrew formula (see also https://github.com/mudge/re2-ci/pull/1) and now see the following errors:

eregon commented 3 years ago

I wonder if re2 somehow uses a custom malloc() or so, which might lead to bad interactions with the default system malloc(). All the errors above are really from some malloc implementation, not from TruffleRuby itself.

mudge commented 3 years ago

I’m not sure if it is relevant but there’s discussion of re2 using a “clever trick” with uninitialized memory (via https://stackoverflow.com/questions/47653565/undefined-behaviour-in-re2-which-stated-to-be-well-defined), specifically in https://github.com/google/re2/blob/master/re2/sparse_array.h and https://github.com/google/re2/blob/master/re2/sparse_set.h
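For context, a rough sketch of the sparse-set idea as I understand it from those links (the Briggs and Torczon representation). This is not re2's code: the class name and the `garbage` parameter are invented for illustration, and the arrays are deliberately filled with an arbitrary value to stand in for uninitialized memory, so that the sketch itself stays well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sparse set over the integers [0, capacity).
// Membership never depends on the arrays' initial contents,
// which is why the real trick can skip initializing them.
class SparseSet {
public:
    // `garbage` simulates whatever happened to be in memory beforehand.
    explicit SparseSet(std::size_t capacity, std::size_t garbage = 0)
        : sparse_(capacity, garbage), dense_(capacity, garbage), size_(0) {}

    bool contains(std::size_t i) const {
        assert(i < sparse_.size());
        const std::size_t slot = sparse_[i];
        // A stale or garbage slot fails one of these two checks.
        return slot < size_ && dense_[slot] == i;
    }

    void insert(std::size_t i) {
        if (contains(i)) return;
        dense_[size_] = i;   // record the member in the next free dense slot
        sparse_[i] = size_;  // remember where that member lives
        ++size_;
    }

    // Clearing is O(1): old entries simply stop passing the checks.
    void clear() { size_ = 0; }

    std::size_t size() const { return size_; }

private:
    std::vector<std::size_t> sparse_;  // value -> index into dense_
    std::vector<std::size_t> dense_;   // members in insertion order
    std::size_t size_;
};
```

With real uninitialized memory the same checks would read indeterminate values, which is the behaviour the Stack Overflow question is about, and which tools watching the heap may flag.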

andrykonchin commented 2 months ago

Now on TruffleRuby 24.2.0-dev the specs fail at loading re2.so (PR):

/home/runner/.rubies/truffleruby-head/bin/truffleruby -I/home/runner/work/re2/re2/vendor/bundle/truffleruby/3.2.2.13/gems/rspec-core-3.13.0/lib:/home/runner/work/re2/re2/vendor/bundle/truffleruby/3.2.2.13/gems/rspec-support-3.13.1/lib /home/runner/work/re2/re2/vendor/bundle/truffleruby/3.2.2.13/gems/rspec-core-3.13.0/exe/rspec --pattern spec/\*\*\{,/\*/\*\*\}/\*_spec.rb
[ruby] WARNING the --debug-frozen-string-literal switch is silently ignored as it is an internal development tool
warning: The native extension at /home/runner/.rubies/truffleruby-head/lib/gems/gems/re2-2.12.0-x86_64-linux/lib/3.2/re2.so has a different ABI version: nil than the running TruffleRuby: "3.2.2.13"

An error occurred while loading spec_helper. - Did you mean?
                    rspec ./spec/spec_helper.rb

Failure/Error: require 're2.so'

LoadError:
  cannot load such file -- re2.so
# ./lib/re2.rb:15:in `<top (required)>'
# ./spec/spec_helper.rb:3:in `<top (required)>'
# ------------------
# --- Caused by: ---
# LoadError:
#   The native extension at /home/runner/.rubies/truffleruby-head/lib/gems/gems/re2-2.12.0-x86_64-linux/lib/3.2/re2.so has a different ABI version: nil than the running TruffleRuby: "3.2.2.13"
#   /home/runner/.rubies/truffleruby-head/lib/truffle/truffle/cext.rb:238:in `check_abi_version'

eregon commented 2 months ago

The native extension at /home/runner/.rubies/truffleruby-head/lib/gems/gems/re2-2.12.0-x86_64-linux/lib/3.2/re2.so has a different ABI version: nil than the running TruffleRuby: "3.2.2.13"

This means that the rb_tr_abi_version symbol/function is not correctly kept in re2.so.