oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/

Deopt creating `Regexp` #3492

Closed nirvdrum closed 3 months ago

nirvdrum commented 4 months ago

@rwstauner and I spent some time tracking down a regex issue in a Rails app we’re running on TruffleRuby. It looks like a pathological case in regex caching. We have a Rails action that creates a new Regexp object via Regexp.union. When the Regexp is created, TruffleRuby adds the RubyRegexp object to the RubyLanguage#regexpTable cache. Then, when we create a TRegex object, we attach it to the cached RubyRegexp instance.

The problem is that RubyLanguage#regexpTable is a WeakValueCache and nothing is retaining a reference to these dynamically created RubyRegexp objects. That’s convenient for memory savings, but because the TRegex object is attached to the RubyRegexp object, when the RubyRegexp is purged from the cache we also lose the TRegex object. Since TRegex lazily builds its call target, this leads to a deopt when populating the local TRegex object cache. Consequently, we’re seeing a ton of deopts for the same Truffle::RegexpOperations.match_in_region split.
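To make the failure mode concrete, here is a toy model in plain Ruby. It is only a sketch: `ToyRegexp`, `ToyCache`, and `evict` are hypothetical names that do not mirror TruffleRuby's internals, and eviction is triggered by hand rather than by the GC. It shows why losing the cached object also loses the lazily built state attached to it.

```ruby
# Toy model only: these classes are hypothetical and do not mirror TruffleRuby internals.
class ToyRegexp
  class << self
    attr_accessor :compilations # counts how often the expensive step runs
  end
  self.compilations = 0

  def initialize(source)
    @source = source
    @compiled = nil # stands in for the lazily built matcher attached to the object
  end

  # Builds the matcher on first use and remembers it on this instance.
  def compiled
    @compiled ||= begin
      ToyRegexp.compilations += 1
      Regexp.new(@source)
    end
  end
end

class ToyCache
  def initialize
    @table = {}
  end

  # Returns the cached instance for this source, creating it if absent.
  def fetch(source)
    @table[source] ||= ToyRegexp.new(source)
  end

  # Simulates a weakly held value being cleared because nothing else references it.
  def evict(source)
    @table.delete(source)
  end
end

cache = ToyCache.new
cache.fetch("ab+").compiled
cache.fetch("ab+").compiled             # same instance, matcher is reused
while_retained = ToyRegexp.compilations

cache.evict("ab+")                      # the attached matcher goes away with the instance
cache.fetch("ab+").compiled             # a fresh instance has to build it again
after_eviction = ToyRegexp.compilations
```

In this model each eviction costs one extra build. In the real case described above, the rebuild is what shows up as a deopt.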

It looks like this could manifest in other ways as well. E.g., anywhere we call Truffle::Type.coerce_to_regexp (e.g., String#scan) could dynamically create a Regexp that will not be retained.

A contrived example that illustrates the problem is:

pairs = 1_000.times.map do |i|
  [/(re?)#{i}/, "str#{i}"]
end

100.times do
  pairs.each.with_index do |(re, str), i|
    Truffle::RegexpOperations.match_from(Regexp.union([re, /#{re.source.upcase}/]), str, 0)
  end
end

In reality, the loop we're executing comes from accessing the same page repeatedly in a load test. The "loop body" in this case is the Rails action.
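Since the root of the problem is that nothing retains the dynamically created Regexp, one possible application-side mitigation on affected versions is to hold a strong reference to it. The sketch below is an assumption, not something taken from the fix: `UnionCache` is a hypothetical helper that memoizes the result of `Regexp.union` so repeated requests reuse the same object.

```ruby
# Hypothetical helper: memoizes Regexp.union results so each one stays strongly referenced.
module UnionCache
  CACHE = {}

  def self.union(*patterns)
    # Key on source and options so equivalent inputs map to the same entry.
    key = patterns.map { |p| p.is_a?(Regexp) ? [p.source, p.options] : p.to_s }
    CACHE[key] ||= Regexp.union(*patterns)
  end
end

re = /(re?)1/
first  = UnionCache.union(re, /#{re.source.upcase}/)
second = UnionCache.union(re, /#{re.source.upcase}/)

same_object = first.equal?(second) # identical object, not just an equal one
matched     = first.match?("str1")
```

This trades memory for stability: the cache is unbounded and unsynchronized as written, so a real application would likely want a size limit and a lock around it.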

I've discussed this with @eregon on the GraalVM Slack.

andrykonchin commented 3 months ago

Fixed in efcfd9836a0869c67b3fbfcd55360fde41259197

eregon commented 3 months ago

This fix should be included in the 24.0.1 release (Apr 16, 2024). And of course it's fixed on master and in truffleruby-dev/head.