ohler55 / oj

Optimized JSON
http://www.ohler.com/oj
MIT License
3.16k stars 252 forks source link

Oj Object Mode does not serialize ActiveSupport:TimeWithZone correctly #187

Closed neilgoodman closed 9 years ago

neilgoodman commented 10 years ago

This issue can be reproduced with the following script:

require 'oj'
require 'active_support/time'

now = ActiveSupport::TimeZone['America/Chicago'].now
oj_dump = Oj.load(Oj.dump(now, mode: :object), mode: :object)

# The time strings will be different
puts "Now: #{now}\nOj: #{oj_dump}"

Gem/Ruby versions:

Oj version: 2.10.2
ActiveSupport version: 4.1.6
Ruby version: 2.1.2p95

From what I can tell this is happening because of Oj's custom serialization for Ruby Time objects:

A "^t" JSON Object key indicates the value should be converted to a Ruby Time. The sequence {"^t":1325775487.000000} is read as Jan 5, 2012 at 23:58:07.

The problem with this is that only the absolute time is serialized, but Time#utc?is not.

There is a Time instance variable on ActiveSupport:TimeWithZone called @utc. @utc.utc? needs to return true, otherwise TZInfo::TimeOrDateTime (used by TimeWithZone internally) will recreate the Time object with Time.utc (https://github.com/tzinfo/tzinfo/blob/v1.2.2/lib/tzinfo/time_or_datetime.rb#L29) and then later call Time#to_time (https://github.com/tzinfo/tzinfo/blob/v1.2.2/lib/tzinfo/time_or_datetime.rb#L322), which causes the Time object to do its own timezone calculation.

A possible solution could be to extend Oj's object serialization of Time objects to include the Time#utc? value and then deserialize to a Time object so that Time#utc? returns appropriately.

ohler55 commented 10 years ago

Let me consider the options. I'll update later this week.

ohler55 commented 10 years ago

If serialized in :object mode the result is huge due to the links to the whole timezone object map. I can not extend Time to include utc as only the active support ActiveSupport::TimeWithZone includes that attribute. What I can do is change the object serialized format to include timezone in the exponent portion of the number so Jan 5, 2012 at 23:58:07 PDT becomes {"^t":1325775487.000000e-70}. The tiimezone offset would be in 10s of minutes. That preserves compatibility with older version and lets new version maintain the timezone.

As for the TimeWithZone, it would be best to provide a custom serializer but fixing the time would allow it to work as is even if the serialized output is huge.

neilgoodman commented 10 years ago

Maybe I am misunderstanding what you are saying about the utc attribute, but Time#utc? is core Ruby: http://www.ruby-doc.org/core-2.1.2/Time.html#method-i-utc-3F

Your solution sounds good.

I agree with the output size, it's way too large. Is there a way to define custom serializers using mode: :object in Oj? I know ActiveSupport::TimeWithZone defines a marshal_dump and a marshal_load method.

neilgoodman commented 10 years ago

I had to double check, but not all offsets are multiples of 10:

Asia/Kathmandu is +0545

neilgoodman commented 10 years ago

And how do you serialize + and - offsets with the exponent?

ohler55 commented 10 years ago

Ahh, regarding utc, I was referring to the @utc attribute of the ActiveSupport::TimeWithZone. I thought that was what you were referring to, not the utc?() method. As part of changing to use the expanded format I'll have Oj create time as gm instead of the ruby C call which creates local time. There will be a performance hit but it will come out as utc time then.

Glad you did more checking than I did. I only saw an offset of 30. I guess that approach will hav ego be refined. I think it is still valid if the offset are in minutes. So instead of the -70 I mentioned for 7 hours it would be -420. Utc time would have to be e0 or some reserved value for compatibility with older versions.

neilgoodman commented 10 years ago

Representing the offset in minutes sounds good. JavaScript's native timezone functions follow that same convention I believe. Using signed exponents will work fine. I don't know why I thought that would be an issue.

ohler55 commented 10 years ago

After some consideration I think using the string format would be better. Less chance of conflict with existing use. There is currently a time format option. If the time is set to xmlschema instead of unix then the string format will be used. Oj.load will then handle either one and the xmlschema string will include the timezone. If zero offset then the time will be utc.

neilgoodman commented 10 years ago

That sounds good. I tried to play around with the time_format option, but it appears to only work with mode: :compat. Would you be enabling the option for mode: :object?

ohler55 commented 10 years ago

The unit test was for the :object format. I have not even tried the compat mode.

neilgoodman commented 10 years ago

Ok, as long as :object respects the option now then this will work fine. Like I said above, :compat does respect the time_format option and I assume those tests are still passing.

ohler55 commented 10 years ago

I've only implement object mode with xmlschema and unix. The ruby mode has not been added. I'm still debating that. I suppose it does not hurt though based on how I ended up implementing the load.

neilgoodman commented 10 years ago

I'm missing something here. Can you post an example of how to use the xmlschema format with the :object mode? I tried to figure out how to do this before opening this issue, because I agree that it is a good solution. Here is code to illustrate what I mean:

require 'oj'

puts Oj.dump(Time.now, mode: :compat, time_format: :xmlschema) # Prints: "2014-09-24T11:41:16.927962000-05:00"
puts Oj.dump(Time.now, mode: :object, time_format: :xmlschema) # Prints: {"^t":1411576910.039681000}

But you seem to be implying there is another way to do this with :object?

ohler55 commented 10 years ago

That is the correct approach. I'll double check the test.

ohler55 commented 10 years ago

irb(main):001:0> puts Oj.dump(Time.now, mode: :compat, time_format: :xmlschema) "2014-09-24T20:15:23.790311000-07:00" => nil irb(main):002:0> puts Oj.dump(Time.now, mode: :object, time_format: :xmlschema) {"^t":"2014-09-24T20:15:32.605216000-07:00"} => nil

ohler55 commented 10 years ago

Release with this fix will be out tonight.

ohler55 commented 10 years ago

Please verify the problem has been corrected.

neilgoodman commented 10 years ago

Still having the issue with TimeWithZone:

require 'oj' # 2.10.4
require 'active_support/time' # 4.1.6

now = ActiveSupport::TimeZone['America/Chicago'].now
oj_dump = Oj.load(Oj.dump(now, mode: :object, time_format: :xmlschema), mode: :object, time_format: :xmlschema)

# The time strings will be different
puts "Now: #{now}\nOj: #{oj_dump}"

Output:

Now: 2014-10-24 08:29:10 -0500
Oj: 2014-10-24 03:29:10 -0500

However, it does look like the time_format: :xmlschema is working correctly for mode: :object:

require 'oj'

puts Oj.dump(Time.now, mode: :compat, time_format: :xmlschema)
puts Oj.dump(Time.now, mode: :object, time_format: :xmlschema)

Output:

"2014-10-24T08:34:04.549309000-05:00"
{"^t":"2014-10-24T08:34:04.549380000-05:00"}
ohler55 commented 10 years ago

ok, I will get it fixed.

ohler55 commented 10 years ago

The fix is going to be the addition of a ruby class called ActiveSupportHelper. It will include helper functions for handling ActiveSupport classes. The first being ActiveSupport::TimeWithZone. It will register routines using the Oj.register_odd() method.

ohler55 commented 10 years ago

I have a version on github if you would like to check it out.

ohler55 commented 10 years ago

I believe this is fixed. If I don't hear otherwise I'll close the issue.