rom-rb / rom

Data mapping and persistence toolkit for Ruby
https://rom-rb.org
MIT License
2.09k stars 161 forks source link

Serialising/de-serialising with the new ROM::Struct format #403

Closed mrship closed 7 years ago

mrship commented 7 years ago

With the move in 3.2.3 to a revised ROM::Struct we're now seeing issues when serialising to JSON via Oj. We do this to store the result of a DB query in redis as our DB connection is very slow.

In 3.2.2 and below we can dump the ROM::Struct into JSON and read it back out quite happily, with 3.2.3 we can't. Now, I realise that Oj is nothing to do with ROM, but the recent change to Structs may have unintended side effects that our use case has highlighted?

Anyway, here's an example (that uses rom-sql)

#!/usr/bin/env ruby
require 'bundler/inline'

gemfile do
  source 'https://rubygems.org'
  gem 'rom', '=3.2.3' #'=3.2.2'
  gem 'rom-sql'
  gem 'rom-repository'
  gem 'dry-types'
  gem 'sqlite3'
  gem 'oj'
end

require 'rom-sql'
require 'rom-repository'
require 'dry-types'

module Types
  include Dry::Types.module
end

rom = ROM.container(:sql, 'sqlite::memory') do |c|
  c.gateways[:default].create_table :employees do
    primary_key :id, Integer
    column :name, String
  end

  c.relation(:employees) do
    schema(:employees, infer: true) do
      attribute :id, Types::Int.meta(primary_key: true)
      attribute :name, Types::String
    end
  end
end

rom.relations[:employees].insert(name: "Jane")

class EmployeeRepo < ROM::Repository[:employees]
  def first
    employees.to_a.first
  end
end

employee = EmployeeRepo.new(rom).first

puts Oj.load(Oj.dump(employee))

# with 3.2.3
# => #<ROM::Struct::Employee:0x7fd5fd45b020>

# with 3.2.2
# => {"id"=>1, "name"=>"Jane"}
flash-gordon commented 7 years ago

Thanks for reporting this, that's probably because I didn't add respond_to_missing? 🙈 , I'll have a look.

flash-gordon commented 7 years ago

Aha, that's because ROM now has named structs, this shouldn't break anything but as we see it did. However, I'd say ROM works fine and this behavior won't change. Oj gets a class by its name but we generate classes dynamically and all of them share the same name. When Oj accesses ROM::Struct::Employee it gets the parent class for all other Employee classes. The class it gets doesn't have any attribute in its definition, that's why it silently skips all data that comes from Oj.load and returns you a useless empty struct. Since all auto-structs are anonymous I don't think it is possible to serialize/deserialize them in a nice manner. E.g. what if you loaded an aggregate

user = user_repo.aggregate(:posts).by_pk(id).one!

and then you serialized it with Oj (or whatever) and try to load in another process, which just doesn't have such a class yet (because no-one tried to build an aggregate like that)?

The only way to make sure you load data back is to use your own named classes instead of auto-structs.

If you're OK with dumping/loading hashes you can explicitly call .to_hash on structs before serializing them, this works as before. Another option is defining to_json and using Oj.to_json:

class ROM::Struct::User < ROM::Struct
  def to_json
    to_hash.to_json
  end
end

Oj.to_json(repo.users.first) # => "{\"id\":1,\"name\":\"Jane\"}"
Oj.load(Oj.to_json(repo.users.first, mode: :compat)) # => {"id"=>1, "name"=>"Jane"}

@solnic I wonder if you have any thoughts on this, or you just should know about the corner case we have here

solnic commented 7 years ago

Oj gets a class by its name

Then all it needs to do is stop doing that and use object.class instead, no?

flash-gordon commented 7 years ago

@solnic how so? It dumps things to a string, ofc it can dump a class to #<Class:0x007fe730250020> but this is pretty dumb because this works only in one process and in one process you do not need to serialize anything

solnic commented 7 years ago

I'm lost, how does it know which class to use for a particular json hash? This sounds like some kind of magic to me :(

flash-gordon commented 7 years ago

Ha, sorry about that :)

[3] pry(main)> Oj.dump(repo.users.first)
=> "{\"^o\":\"ROM::Struct::User\",\"id\":1,\"name\":\"Jane\"}"
flash-gordon commented 7 years ago

then it calls ROM::Struct::User.new(id :1, name: "Jane") and ROM::Struct::User.new swallows the args because the class doesn't have attributes

solnic commented 7 years ago

I see, thanks. Well it can't work reliably, previously it worked by an accident. Structs are dynamic classes, we don't guarantee unique names. Maybe we could consider generating unique class names and then it could work, but I honestly can't think of any sensible naming strategy :/

flash-gordon commented 7 years ago

@solnic keep in mind also that the class you want to deserialize to have to exist at the time you call Oj.load and we can't guarantee that, even with unique names

solnic commented 7 years ago

Yeah, it really boils down to the fact Oj doesn't have first-class support for anonymous classes. As I said, it worked by an accident before :)

mrship commented 7 years ago

Thanks for the detailed review. In line with your suggestion, we are OK with using hashes, we've changed the serialisation to use:

Oj.load(Oj.dump(employee, mode: :compat))

which gets us the raw attributes back out. We do then consume these in custom entity classes (based on Dry::Struct) so that works as a deserialisation approach.

solnic commented 7 years ago

In general I would recommend encapsulating json serialization/deserialization with your own APIs, instead of referring to 3rd party gems, ie Entities::User#to_json and Entities::User#from_json. This way you can at least easily tell that a given entity is used with JSON, and you also hide the details about how it's done exactly.

I think we can close it?

mrship commented 7 years ago

Yes, we can close this. Thanks again for the time and effort in reviewing.