interagent / pliny

An opinionated toolkit for writing excellent APIs in Ruby.
MIT License
802 stars 73 forks source link

Gem "oj" 3.0 breaks "encode" #316

Closed fdr closed 7 years ago

fdr commented 7 years ago

I've noticed a fairly large behavioral change in oj that breaks a number of old once-working code. It seems to take place in the changes for 3.0.

Among the causalities is encode, even when passed a hash in the normal way in the serializers. I am using the newest released pliny.

For example, it used to be that Sequel models would be automatically serialized as their values. Now the to_s representation gets serialized (so, sequel_model_object.values is the fix).

I suspect there is a similar assumption somewhere, e.g. maybe reliance on to_h or to_hash, that is no longer advised.

brandur commented 7 years ago

Would it be correct to say that the root of the problem probably has something to do with previously implicit serialization of Sequel models?

fdr commented 7 years ago

I don't think so, because my broken encode expressions involve hashes.

fdr commented 7 years ago

Although, maybe.

fdr commented 7 years ago

No, I think you are right. I thought I caught one where it was just a hash, but nope. It's probably all root-caused by serialization of Sequel models.

brandur commented 7 years ago

Ah nice. Yeah, I only ask because it would be surprising if oj broke serialization of hashes containing only arrays/hashes and primitive types — that would have to be considered a big bug I think. Breaking undefined behavior around encoding more complex objects would be totally plausible though.

fdr commented 7 years ago

On Fri, Oct 27, 2017 at 4:59 PM Brandur notifications@github.com wrote:

Ah nice. Yeah, I only ask because it would be surprising if oj broke serialization of hashes containing only arrays/hashes and primitive types — that would have to be considered a big bug I think. Breaking undefined behavior around encoding more complex objects would be totally plausible though.

Actually, I take it back. Something is amiss. I'll dig in.

fdr commented 7 years ago

Took a while, but here it is. https://github.com/ohler55/oj/issues/446

brandur commented 7 years ago

Nice!

I'll leave it up to the maintainers over there to do what they will, but general comment: it's probably not a good idea to rely upon undefined behavior in a library like oj — serializing a complex object for example. You and I both know that serializing a JSONBHash as if it were just a normal hash is a pretty sensible thing to do, but it would be a boundaries violation for oj to understand that.

A better answer than wait on a fix in OJ might be to reimplement your code so that either (1) any non-standard objects are cast to something that is standard before calling out to Oj, or (2) a check is made on serialize that there are no non-standard objects in the data and an exception raised otherwise — this would force the implementer of any particular endpoint to explicitly do the casting for themselves.

fdr commented 7 years ago

It's not really undefined: with use_to_json set, it should call that symbol.

puts Sequel::Postgres::JSONBHash.new({hello: 'world'}).to_json
{"hello":"world"}
brandur commented 7 years ago

Ah, cool. Well you are still somewhat reliant in that case of JSONBHash properly implementing #to_json or a monkey patch to provide your own implementation. As long as you know what you are doing though :)

fdr commented 7 years ago

Well, it delegates to_hash. As it turns out, it was neither of these things, but a difference in what compat means entirely that MultiJSON doesn't understand. https://github.com/ohler55/oj/issues/446 https://github.com/intridea/multi_json/issues/181

joshwlewis commented 6 years ago

Just a note for future readers:

My team has been seeing similar issues when upgrading to oj version 3 or greater. oj's :compat mode (which is the mode multi_json uses for oj) was changed substantially in version 3, where it prefers #to_json rather than #to_hash to serialize objects. We ended up solving it by configuring multi_json to use a :custom mode for oj that prefers #to_hash like this:

MultiJson.dump_options = {
  mode: :custom,
  use_to_hash: true,
  use_to_json: false
}