oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/
Other
3.03k stars 185 forks source link

[Compatibility] Time.new should parse a string and error for missing time info #3693

Closed rwstauner closed 2 weeks ago

rwstauner commented 1 month ago

From https://bugs.ruby-lang.org/issues/19293 As of CRuby 3.2.3 Time.new can take a string, parse it, and error if time info is missing. The current behavior of TruffleRuby does not error, but worse, sets everything but the year to a default value:

$ RBENV_VERSION=truffleruby-dev ruby -rtime -e 'puts Time.new("2021-02-21")'
# note the month and day are lost
2021-01-01 00:00:00 -0700

$ RBENV_VERSION=3.3.5 ruby -rtime -e 'puts Time.new("2021-02-21")'
<internal:timev>:409:in `initialize': no time information (ArgumentError)
        from -e:1:in `new'
        from -e:1:in `<main>'

There are already many specs for this that are currently tagged as fails on TruffleRuby: https://github.com/oracle/truffleruby/blob/72fd3638619156f62219207b723ae348d9fc3038/spec/ruby/core/time/new_spec.rb#L544-L547

Unfortunately this behavior is relied upon in rails. You can see this with a simple rails setup:

rails new app --skip-action-cable
bin/rails g model Thing on:date at:datetime --force

This will produce a fixture file like so:

one:
  'on': 2024-10-17
  at: 2024-10-17 16:17:42

two:
  'on': 2024-10-17
  at: 2024-10-17 16:17:42

Then printing out the records in a test is enough to see the problem:

require "test_helper"

class ThingTest < ActiveSupport::TestCase
  test "the truth" do
    p things(:one)
    p things(:two)
  end
end

CRuby:

#<Thing id: 980190962, on: "2024-10-18", at: "2024-10-18 12:20:33.000000000 +0000", created_at: "2024-10-18 19:22:14.443886000 +0000", updated_at: "2024-10-18 19:22:14.443886000 +0000">
#<Thing id: 298486374, on: "2024-10-18", at: "2024-10-18 12:20:33.000000000 +0000", created_at: "2024-10-18 19:22:14.443886000 +0000", updated_at: "2024-10-18 19:22:14.443886000 +0000">

TruffleRuby:

#<Thing id: 980190962, on: "2024-10-18", at: "2024-01-01 00:00:00.000000000 +0000", created_at: "2024-01-01 00:00:00.000000000 +0000", updated_at: "2024-01-01 00:00:00.000000000 +0000">
#<Thing id: 298486374, on: "2024-10-18", at: "2024-01-01 00:00:00.000000000 +0000", created_at: "2024-01-01 00:00:00.000000000 +0000", updated_at: "2024-01-01 00:00:00.000000000 +0000">
rwstauner commented 1 month ago

Looking at the ArgumentError specs, it seems like it generally parses the string one unit at a time, and if the current token doesn't match the criteria the error message contains the remainder of the string

eregon commented 1 month ago

The current behavior of TruffleRuby does not error, but worse, sets everything but the year to a default value:

Yes, TruffleRuby has the same behavior as CRuby 3.0 for this, because these changes for Time.new have not been implemented yet (see https://github.com/oracle/truffleruby/issues/3039).

Is there any chance you or @nirvdrum would have time to implement this?

There was an initial version of that change which used a Regexp and looked pretty simple, I think we should try that: https://github.com/ruby/ruby/pull/4639/files

The version that ended up being merged is quite a bit more complicated by parsing everything manually: https://github.com/ruby/ruby/pull/4825/files

But there are also more changes due to https://bugs.ruby-lang.org/issues/19293 so it's worth checking ruby 3.3.5 sources: https://github.com/ruby/ruby/blob/v3_3_5/time.c#L2541

Either way I think we should implement this in Ruby code (not Java or C).

eregon commented 1 month ago

TruffleRuby should implement this behavior, but I half wonder if Rails should really use such a poorly-designed method which is incompatible between Ruby versions, it seems Rails main has to workaround some of it anyway: https://github.com/rails/rails/blob/main/activemodel/lib/active_model/type/helpers/time_value.rb#L72 And the old logic was pretty simple and straightforward: https://github.com/rails/rails/blob/c49b8270/activemodel/lib/active_model/type/helpers/time_value.rb#L72-L86

I suppose the performance difference makes it worth it on CRuby? There would probably be no advantage on TruffleRuby to use Time.new over a Regexp, Time.new might be even slower as it has more complicated semantics.

eregon commented 1 month ago

FWIW there seems to be a bunch of bugs in Time.new(String), I tried to link them to https://bugs.ruby-lang.org/issues/18033. Notably https://bugs.ruby-lang.org/issues/20797#change-110163 which won't even be backported to 3.2 (I'm not 100% sure it applies to Time.new(String) too but it seems likely).