rubyjs / mini_racer

Minimal embedded v8
MIT License

0.4.0 crashes on aarch64-linux for Ruby 2.7 + 3.0; LoadError with 2.5 and 2.6 (via Docker for Mac on M1) #190

Closed tisba closed 2 years ago

tisba commented 3 years ago

I used the following to check if mini_racer works with Docker for Mac on Apple Silicon. RUBY_PLATFORM is aarch64-linux.

# frozen_string_literal: true

require "bundler/inline"

gemfile do
  source "https://rubygems.org"

  gem "mini_racer", "0.4.0"
  gem "libv8-node", "15.14.0.1"
end

puts "RUBY_VERSION : #{RUBY_VERSION}"
puts "RUBY_PLATFORM: #{RUBY_PLATFORM}"
puts "MiniRacer::LIBV8_NODE_VERSION: #{MiniRacer::LIBV8_NODE_VERSION}"
puts "Libv8::Node::VERSION: #{Libv8::Node::VERSION}"
puts "Libv8::Node::NODE_VERSION: #{Libv8::Node::NODE_VERSION}"
puts "Libv8::Node::LIBV8_VERSION: #{Libv8::Node::LIBV8_VERSION}"

ctx = MiniRacer::Context.new
ctx.eval("1+1")

I'm on Docker for Mac 3.3.3 (64133). For testing, I run the above script like this:

docker run -it --rm -v "$(pwd)":/app ruby:2.7.2 ruby /app/minimal.rb
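To repeat this run across several Ruby images, a small driver script can help. The following is only a rough sketch and not part of the original report: the helper names `docker_command` and `run_matrix`, the `RUBY_TAGS` list, and the assumption that `minimal.rb` sits in the current directory are all illustrative. It drops `-it` because nothing interactive is needed.

```ruby
# frozen_string_literal: true

# Hypothetical helper (not part of mini_racer): builds the `docker run`
# argument list for one official Ruby image tag.
def docker_command(tag, script: "minimal.rb", dir: Dir.pwd)
  [
    "docker", "run", "--rm",
    "-v", "#{dir}:/app",
    "ruby:#{tag}",
    "ruby", "/app/#{script}"
  ]
end

# Image tags mentioned in this report; adjust as needed.
RUBY_TAGS = %w[2.5.8 2.6.7 2.7.0 2.7.1 2.7.2 2.7.3].freeze

# Runs the script once per tag and records whether the process exited cleanly.
# `system` returns true on success, false on a non-zero exit status, and nil
# if the command could not be started at all.
def run_matrix(tags)
  tags.each_with_object({}) do |tag, results|
    results[tag] = system(*docker_command(tag))
  end
end
```

Calling `run_matrix(RUBY_TAGS)` would then return a hash from image tag to success flag, which makes it easier to see at a glance which Rubies crash.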

Ruby 3

Ruby 2.7

Ruby 2.7.0, 2.7.1, 2.7.2 and 2.7.3 result in *** stack smashing detected ***: <unknown> terminated. I'm not sure how to supply more detail here.

Ruby 2.6 and 2.5

Ruby 2.5.8 and 2.6.7 give a LoadError:

/usr/local/bundle/gems/libv8-node-15.12.0.0.beta1-aarch64-linux/lib/libv8/node.rb:2:in `require': cannot load such file -- libv8-node/location (LoadError)
        from /usr/local/bundle/gems/libv8-node-15.12.0.0.beta1-aarch64-linux/lib/libv8/node.rb:2:in `<top (required)>'
        from /usr/local/bundle/gems/libv8-node-15.12.0.0.beta1-aarch64-linux/lib/libv8-node.rb:1:in `require'
        from /usr/local/bundle/gems/libv8-node-15.12.0.0.beta1-aarch64-linux/lib/libv8-node.rb:1:in `<top (required)>'
        from /usr/local/lib/ruby/2.6.0/bundler/runtime.rb:81:in `require'
        from /usr/local/lib/ruby/2.6.0/bundler/runtime.rb:81:in `block (2 levels) in require'
        from /usr/local/lib/ruby/2.6.0/bundler/runtime.rb:76:in `each'
        from /usr/local/lib/ruby/2.6.0/bundler/runtime.rb:76:in `block in require'
        from /usr/local/lib/ruby/2.6.0/bundler/runtime.rb:65:in `each'
        from /usr/local/lib/ruby/2.6.0/bundler/runtime.rb:65:in `require'
        from /usr/local/lib/ruby/2.6.0/bundler/inline.rb:70:in `gemfile'
        from /app/minimal.rb:8:in `<main>'

mini_racer currently supports Ruby >= 2.3. I tested all available versions of 2.7 and 3, but only the latest one of 2.6 and 2.5. One could consider making 2.5 the minimal supported version, as 2.3 and 2.4 are EOL for a while now. Update: the Ruby version requirement has since been increased to >= 2.5.

I'm happy to provide more data if it helps. I also have a larger project with substantial use of mini_racer I'm planning to test with M1 hardware, but there are several other pieces currently blocking me from running the test suite.

lloeki commented 3 years ago

Thanks for the detailed report.

On the build side, I've encountered this failure when building on Linux on an M1:

  g++ -o /code/src/node-15.14.0/out/Release/obj.target/v8_base_without_compiler/deps/v8/src/heap/base/asm/arm64/push_registers_asm.o ../deps/v8/src/heap/base/asm/arm64/push_registers_asm.cc '-DV8_GYP_BUILD' '-DV8_TYPED_ARRAY_MAX_SIZE_IN_HEAP=64' '-DV8_COMPRESS_POINTERS' '-DV8_31BIT_SMIS_ON_64BIT_ARCH' '-D__STDC_FORMAT_MACROS' '-DOPENSSL_THREADS' '-DOPENSSL_NO_ASM' '-DV8_TARGET_ARCH_ARM64' '-DV8_EMBEDDER_STRING="-node.28"' '-DENABLE_DISASSEMBLER' '-DV8_PROMISE_INTERNAL_FIELD_COUNT=1' '-DENABLE_MINOR_MC' '-DOBJECT_PRINT' '-DV8_INTL_SUPPORT' '-DV8_CONCURRENT_MARKING' '-DV8_ENABLE_LAZY_SOURCE_POSITIONS' '-DV8_USE_SIPHASH' '-DDISABLE_UNTRUSTED_CODE_MITIGATIONS' '-DV8_WIN64_UNWINDING_INFO' '-DV8_ENABLE_REGEXP_INTERPRETER_THREADED_DISPATCH' '-DV8_SNAPSHOT_COMPRESSION' '-DICU_UTIL_DATA_IMPL=ICU_UTIL_DATA_STATIC' '-DUCONFIG_NO_SERVICE=1' '-DU_ENABLE_DYLOAD=0' '-DU_STATIC_IMPLEMENTATION=1' '-DU_HAVE_STD_STRING=1' '-DUCONFIG_NO_BREAK_ITERATION=0' -I../deps/v8 -I../deps/v8/include -I/code/src/node-15.14.0/out/Release/obj/gen/inspector-generated-output-root -I../deps/v8/third_party/inspector_protocol -I/code/src/node-15.14.0/out/Release/obj/gen/torque-output-root -I/code/src/node-15.14.0/out/Release/obj/gen/generate-bytecode-output-root -I../deps/icu-small/source/i18n -I../deps/icu-small/source/common -I../deps/v8/third_party/zlib -I../deps/v8/third_party/zlib/google  -pthread -Wno-unused-parameter -fPIC -Wno-return-type -fno-strict-aliasing -O3 -fno-omit-frame-pointer -fdata-sections -ffunction-sections -O3 -fno-rtti -fno-exceptions -std=gnu++1y -MMD -MF /code/src/node-15.14.0/out/Release/.deps//code/src/node-15.14.0/out/Release/obj.target/v8_base_without_compiler/deps/v8/src/heap/base/asm/arm64/push_registers_asm.o.d.raw   -c
/tmp/ccJ6FZKe.s: Assembler messages:
/tmp/ccJ6FZKe.s:14: Error: operand 1 must be a floating-point register -- `stp fp,lr,[sp,#-16]!'
/tmp/ccJ6FZKe.s:15: Error: operand 1 must be an integer register -- `mov fp,sp'
/tmp/ccJ6FZKe.s:19: Error: operand 1 must be an integer register -- `ldr lr,[sp,#8]'
/tmp/ccJ6FZKe.s:20: Error: operand 1 must be an integer register -- `ldr fp,[sp],#96'
tools/v8_gypfiles/v8_base_without_compiler.target.mk:734: recipe for target '/code/src/node-15.14.0/out/Release/obj.target/v8_base_without_compiler/deps/v8/src/heap/base/asm/arm64/push_registers_asm.o' failed

Strangely, it sometimes goes through.

I would not be surprised if that build-in-a-VM trickery had consequences.

I'm going to try cross-compiling to arm on x86, it should give more reliable/portable results and work in CI as well.

tisba commented 3 years ago

Looks like @SamSaffron changed the required Ruby version to be >= 2.5 👍 So we're dealing with some V8 related issues on 2.7.x and some other issues on 2.5.x and 2.6.x.

lloeki commented 3 years ago

One could consider making 2.5 the minimal supported version, as 2.3 and 2.4 are EOL for a while now.

FYI I'll be progressively trying to upstream as many of the improvements we made on sq_mini_racer as possible (like fast UTF-8 validation via SIMD, as well as other interesting stuff we are prototyping; can't say too much but it's going to be hot) to minimise or eliminate the fork.

One of the requirements we have over here though is old Ruby version support (down to 2.0 currently) which for mini_racer is not too hard... the hard part being the extension building and binary distribution, which is largely independent of the Ruby version but has much to do with the various OSes and base images people are using around, and needs solving anyway. So instead of keeping it in a painful fork I'd rather upstream it as well.

SamSaffron commented 3 years ago

I am a very strong no on any support for EOL Rubies. I don't support EOL Rubies on any of my projects.

I am open to keeping old support if we must, to avoid a fork, @lloeki, but would prefer to stay with supported Rubies only.

lloeki commented 3 years ago

Thank you @SamSaffron for keeping that door ajar.

That said, I'm totally with you, I'd rather have support of non-EOL rubies only in mini_racer, and I'm thinking about alternative solutions to keep it that way.

It's still a long shot anyway, as I'll need to upstream the improvements first to minimise the drift.

SamSaffron commented 3 years ago

@lloeki I noticed that our CI is complaining pretty loudly about "Stack Smashed" errors and various beasts. Is there anything we can do to make the CI situation better?

I don't think I saw a totally green build from CI in a while.

lloeki commented 3 years ago

@SamSaffron yeah, I've seen that, it's been grilling me for a while.

arm64 Linux fails on GHA's x86+qemu@arm64 with the stack smash error but as I recall succeeds on Travis (which runs on graviton arm64 hardware IIRC). This might be due to either a qemu bug for that arch, me compiling on arm64 Linux in a VM atop Darwin/M1 which picks up a specific feature but the qemu CPU lacking that feature while graviton has it, or a compiler toolchain issue that would be fixed in a more recent version. FWIW the compile-time assembly error I previously commented on was because of a too old compiler.
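One way to probe the "missing CPU feature" hypothesis is to compare the feature flags each environment reports. The sketch below is an assumption-laden diagnostic, not something that was run for this issue: the helper names `cpu_features` and `missing_features` are made up, and it relies on `/proc/cpuinfo` exposing a `Features` line on aarch64 Linux (or `flags` on x86).

```ruby
# frozen_string_literal: true

# Extracts the CPU feature flags from /proc/cpuinfo-style text.
# On aarch64 Linux the line is labelled "Features"; on x86 it is "flags".
def cpu_features(cpuinfo_text)
  line = cpuinfo_text.each_line.find { |l| l =~ /\A(Features|flags)\s*:/ }
  return [] unless line

  line.split(":", 2).last.split.sort
end

# Returns the flags present on the build machine but absent at runtime.
def missing_features(build_cpuinfo, runtime_cpuinfo)
  cpu_features(build_cpuinfo) - cpu_features(runtime_cpuinfo)
end
```

Saving `File.read("/proc/cpuinfo")` from the build VM and from the failing qemu runner, then passing both strings to `missing_features`, would list flags the compiler could have targeted that the runtime CPU does not advertise. An empty result would not rule out a qemu or toolchain bug.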

There's a 3.0-alpine/x64 failure that I've been trying to make sense of for a while. I think something changed between 2.7-alpine and 3.0-alpine that makes it behave differently (at some point in the 3.0-preview it was completely broken so as to produce a blank CXX var which would just break everything). Not sure if it's coming from upstream Ruby 3, upstream Alpine, this specific Docker-stamped image, or a combination.

For the first one I'll first try to build with a newer GCC or switch to clang and see the result.

For the second one I have to figure out exactly what's going wrong with that 3.0-alpine image.

tisba commented 3 years ago

For reference: The issue still persists with mini_racer=0.4.0 and libv8-node=15.14.0.1.

I'm wondering if the LoadError for Ruby 2.5.8 and 2.6.7 is at all related to mini_racer.

lloeki commented 3 years ago

It's most probably not. I think the LoadError comes from a failure by Ruby to load the .so with the root cause being the problematic libv8-node on aarch64, it just manifests itself differently than the stack smashing (which loads but then proceeds to burst into flames).
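To tell a missing file apart from a file that exists but fails to load, one could check what is actually on the load path. This is a simplified sketch under stated assumptions: `locate_feature` is a hypothetical helper, and it only looks at directories already on `$LOAD_PATH`, so it ignores RubyGems' lazy activation of gems that are not loaded yet.

```ruby
# frozen_string_literal: true

require "rbconfig"

# Returns the first file on the load path that `require feature` could pick
# up, or nil if neither a .rb file nor a native extension with that name
# exists in any of the given directories.
def locate_feature(feature, load_path = $LOAD_PATH)
  extensions = [".rb", ".#{RbConfig::CONFIG['DLEXT']}"]
  load_path.each do |dir|
    extensions.each do |ext|
      candidate = File.join(dir.to_s, feature + ext)
      return candidate if File.file?(candidate)
    end
  end
  nil
end
```

Run after the gems are activated, `locate_feature("libv8-node/location")` returning nil would point towards a packaging problem in the platform gem, whereas a path would suggest the file is present and the failure happens while loading it.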

tisba commented 3 years ago

Little update here: I have had a couple of *** stack smashing detected ***: <unknown> terminated on Ruby 3.0.1p64 on aarch64-linux.

Unfortunately, I cannot reproduce it with the example from the PR description 😞 I'll see if I can create a minimal example to make this more reproducible.

cyri113 commented 3 years ago

Hello, we are having a similar issue with *** stack smashing detected ***: <unknown> terminated.

Is there any update on this?

Running:

chrisalley commented 3 years ago

Running rails assets:precompile with mini_racer 0.4.0 in the Gemfile caused the following malloc(): invalid size (unsorted) error:

 => ERROR [14/14] RUN RAILS_ENV=production SECRET_KEY_BASE=abcd1234 rails assets:precompile -  2.8s
------
 > [14/14] RUN RAILS_ENV=production SECRET_KEY_BASE=abcd1234 rails assets:precompile --trace:
#18 0.821 ** Invoke assets:precompile (first_time)
#18 0.821 ** Invoke assets:environment (first_time)
#18 0.821 ** Execute assets:environment
#18 0.821 ** Invoke environment (first_time)
#18 0.821 ** Execute environment
#18 0.973 ** Invoke yarn:install (first_time)
#18 0.973 ** Execute yarn:install
#18 2.328 yarn install v1.22.5
#18 2.348 [1/4] Resolving packages...
#18 2.350 [2/4] Fetching packages...
#18 2.352 [3/4] Linking dependencies...
#18 2.355 [4/4] Building fresh packages...
#18 2.358 Done in 0.03s.
#18 2.369 ** Execute assets:precompile
#18 2.690 malloc(): invalid size (unsorted)
#18 2.734 Aborted
------
executor failed running [/bin/sh -c RAILS_ENV=production SECRET_KEY_BASE=abcd1234 rails assets:precompile --trace]: exit code: 134
ERROR: Service 'app' failed to build : Build failed

However, this error only occurs when running rails assets:precompile as part of a Docker build when the host OS is running on Apple Silicon. The same command works fine on an Intel Mac (inside and outside of a Docker build) with mini_racer still included in the Gemfile.

As a workaround, I have commented out mini_racer in the Gemfile and installed node/yarn separately earlier in the Dockerfile:

FROM ruby:3.0.1

...

RUN apt-get update -y && apt-get install -y --no-install-recommends \
  build-essential

# Install Yarn
RUN wget --quiet -O - https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add -
RUN echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list
RUN apt update && apt install -y yarn

...

RUN gem install bundler --version=2.2.20
RUN bundle install

...

# Precompile assets
RUN rails assets:clobber
RUN RAILS_ENV=production SECRET_KEY_BASE=abcd1234 rails assets:precompile --trace

lloeki commented 3 years ago

Thanks for the additional report.

As far as I could gather, the issue indeed seems to affect only virtualised ARM (which is what Docker for Mac uses), while things appear to run OK on ARM Darwin and on Graviton instances.

At this stage, the why remains elusive. I need to try cross-compiling to a given ARM CPU target instead of building within a virtualised environment.

lloeki commented 3 years ago

I'm tracking progress of ARM support at the libv8-node repo. Short status update: cross-compiling to aarch64 on x86_64 is promising and appears OK, but I run into a dumb missing symbol error (v8::V8::SetFlagsFromString(char const*, unsigned long)) when testing.

lloeki commented 3 years ago

I now have a (manually) cross-compiled libv8-node 16.3.0 gem over here, it works for me but I'd rather not publish it on rubygems yet because I have to persist and automate the host=x86_64/target=aarch64 cross-compiling changes in CI.

If anyone is up to try it, I can provide the gem as well as the small diff of changes for mini_racer to support libv8-node 16.3.0 (I'll do a PR around shortly)

tisba commented 3 years ago

Awesome! I'm happy to give it a try (on my Mac mini M1), both on the minimal example in the PR description and on a private repo where we use tons of mini_racer in app logic and tests.

lloeki commented 3 years ago

@tisba here goes https://www.dropbox.com/s/s0ip9f3fmxuy4x0/libv8-node-16.3.0.0-aarch64-linux.gem?dl=0

tisba commented 3 years ago

At first I was confused, because it did resolve the dependency correctly, but I assume you want me to test this with https://github.com/rubyjs/mini_racer/tree/libv8-node-16, right?

I put your .gem file into a local gem repository (created via gem generate_index) and used the inline Gemfile:

gemfile do
  source "https://rubygems.org"

  gem "mini_racer", git: "https://github.com/rubyjs/mini_racer", branch: "libv8-node-16"
  gem "libv8-node", source: "file:///app/repo"
end

The rest of the script is like in the PR description. And it looks good :)

tisba commented 3 years ago

Interestingly https://github.com/rubyjs/mini_racer/issues/170 seems also to be resolved. At least I cannot reproduce the problem with this build.

I need a bit more time to make that work against my large private project, but so far this looks very promising!

lloeki commented 3 years ago

but I assume you want me to test this with https://github.com/rubyjs/mini_racer/tree/libv8-node-16, right?

Correct! I forgot to put the link up here, sorry 🙏

Interestingly #170 seems also to be resolved.

Hmm, I do know about the issue cause but only have cursory knowledge of how the issue is supposed to be addressed, but there might be some additional improvements as to how V8 9.0 handles forks on top of single threaded mode.

tisba commented 3 years ago

Interestingly #170 seems also to be resolved.

Hmm, I do know about the issue cause but only have cursory knowledge of how the issue is supposed to be addressed, but there might be some additional improvements as to how V8 9.0 handles forks on top of single threaded mode.

I'll keep an eye on it, let's first make sure that the aarch64 builds work as expected 😉

tisba commented 3 years ago

to keep track, see: https://github.com/rubyjs/mini_racer/pull/210

tisba commented 3 years ago

Quick update related to this issue using mini_racer=0.5.0.pre

I've successfully re-run my test on M1 with Docker Desktop with

gem "mini_racer", "0.5.0.pre"
gem "libv8-node", "16.10.0.0"

against these Rubies:

  • 3.0.0, 3.0.1, 3.0.2
  • 2.7.3
  • 2.6.7

They all ran to completion 🥳

So it looks like, we could – once 0.5.0 gets released – add a remark to the readme that 0.4.0 has some issues on aarch and 0.5.0 should be used.
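Besides a readme remark, a project could also warn at boot time. The following is a minimal sketch of such a guard, not something mini_racer ships: the constant and method names are hypothetical, and the cutoff is simply the first version reported as working in this thread.

```ruby
# frozen_string_literal: true

require "rubygems"

# First mini_racer version reported as working on aarch64-linux in this thread.
FIRST_WORKING_ON_AARCH64 = Gem::Version.new("0.5.0.pre")

# Hypothetical helper: true when the platform/version pair matches the
# combination reported as crashing here.
def affected_by_aarch64_crash?(platform, mini_racer_version)
  platform.start_with?("aarch64-linux") &&
    Gem::Version.new(mini_racer_version) < FIRST_WORKING_ON_AARCH64
end
```

An application could call it as `warn "mini_racer may crash here" if affected_by_aarch64_crash?(RUBY_PLATFORM, MiniRacer::VERSION)`. Keep in mind that the crashes were reported on virtualised ARM, so the check is deliberately broad and would also flag hosts such as Graviton where 0.4.0 reportedly ran fine.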

SamSaffron commented 3 years ago

This is great news, may bump the version this week

On Wed, 10 Nov 2021 at 7:25 pm, Sebastian Cohnen @.***> wrote:

Quick update related to this issue using mini_racer=0.5.0.pre

I've successfully re-run my test on M1 with Docker Desktop with

gem "mini_racer", "0.5.0.pre"
gem "libv8-node", "16.10.0.0"

against these Rubies:

  • 3.0.0, 3.0.1, 3.0.2
  • 2.7.3
  • 2.6.7

They all ran to completion 🥳

So it looks like, we could – once 0.5.0 gets released – add a remark to the readme that 0.4.0 has some issues on aarch and 0.5.0 should be used.

pandaiolo commented 2 years ago

@tisba thanks! Works nice for me as well.

One thing, though: while it passes on M1, it breaks on CircleCI with the following output:

Gem::Ext::BuildError: ERROR: Failed to build gem native extension.

current directory:
/home/circleci/bw_rails/vendor/bundle/ruby/2.6.0/gems/mini_racer-0.5.0.pre/ext/mini_racer_extension
/usr/local/bin/ruby -I /usr/local/lib/ruby/2.6.0 -r
./siteconf20211201-526-162if6l.rb extconf.rb
checking for -lpthread... yes
creating Makefile

current directory:
/home/circleci/bw_rails/vendor/bundle/ruby/2.6.0/gems/mini_racer-0.5.0.pre/ext/mini_racer_extension
make "DESTDIR=" clean

current directory:
/home/circleci/bw_rails/vendor/bundle/ruby/2.6.0/gems/mini_racer-0.5.0.pre/ext/mini_racer_extension
make "DESTDIR="
compiling mini_racer_extension.cc
cc1plus: warning: command-line option ‘-Wimplicit-int’ is valid for C/ObjC but
not for C++
cc1plus: note: unrecognized command-line option ‘-Wno-self-assign’ may have been
intended to silence earlier diagnostics
cc1plus: note: unrecognized command-line option ‘-Wno-parentheses-equality’ may
have been intended to silence earlier diagnostics
cc1plus: note: unrecognized command-line option ‘-Wno-constant-logical-operand’
may have been intended to silence earlier diagnostics
linking shared-object mini_racer_extension.so
g++: error:
/home/circleci/bw_rails/vendor/bundle/ruby/2.6.0/gems/libv8-node-16.10.0.0-x86_64-linux/vendor/v8/x86_64-linux/libv8/obj/libv8_monolith.a:
No such file or directory
make: *** [Makefile:262: mini_racer_extension.so] Error 1

make failed, exit code 2

Gem files will remain installed in
/home/circleci/bw_rails/vendor/bundle/ruby/2.6.0/gems/mini_racer-0.5.0.pre for
inspection.
Results logged to
/home/circleci/bw_rails/vendor/bundle/ruby/2.6.0/extensions/x86_64-linux/2.6.0/mini_racer-0.5.0.pre/gem_make.out

An error occurred while installing mini_racer (0.5.0.pre), and Bundler

I'm not familiar with those, so I'm not sure if there's an easy workaround?

Edit: would that be related to an obsolete toolchain in the CI image?

From the readme:

Note using v8.h and compiling MiniRacer requires a C++11 standard compiler, more specifically clang 3.5 (or later) or gcc 4.8 (or later).

tisba commented 2 years ago

To wrap this up: The issue is resolved with mini_racer=0.6.0 and libv8-node=16.10.0.0.

I tested on M1 Mac Mini macOS 12.1, using Docker Desktop. So RUBY_PLATFORM is aarch64-linux.

Script used for testing:

# frozen_string_literal: true

# Run with:
#   docker run -it --rm -v "$(pwd)":/app ruby:2.7.3 ruby /app/minimal.rb

require "bundler/inline"

gemfile do
  source "https://rubygems.org"

  gem "mini_racer", "0.6.0"
  gem "libv8-node", "16.10.0.0"
end

puts "RUBY_VERSION : #{RUBY_VERSION}"
puts "RUBY_PLATFORM: #{RUBY_PLATFORM}"
puts "MiniRacer::LIBV8_NODE_VERSION: #{MiniRacer::LIBV8_NODE_VERSION}"
puts "Libv8::Node::VERSION: #{Libv8::Node::VERSION}"
puts "Libv8::Node::NODE_VERSION: #{Libv8::Node::NODE_VERSION}"
puts "Libv8::Node::LIBV8_VERSION: #{Libv8::Node::LIBV8_VERSION}"

ctx = MiniRacer::Context.new
ctx.eval("1+1")

Tested with:

tisba commented 2 years ago

@pandaiolo I'd suggest you open a new issue to make it a bit easier for maintainers to keep track.

chrisalley commented 2 years ago

I can confirm that the rails assets:precompile command works with mini_racer 0.6.0 on an M1 MacBook Air, using Docker Desktop.

I've created a separate issue for the CircleCI problem: #227