jeremyevans / sequel

Sequel: The Database Toolkit for Ruby
http://sequel.jeremyevans.net

Sequel's atom-based column naming conflicts with serializers #396

Closed: Cloven closed this issue 12 years ago

Cloven commented 12 years ago

JSON, MessagePack, and other popular serialization/deserialization packages do not work well with atoms. Only the Marshal serializer, which is slower and produces larger output, is able to 'bring back' column names as atoms from its serialization format.

Given the prevalence of JSON and other systems, and the requirement to stash a model (e.g. in a cache) and reconstitute it in a form that can be used, I propose a configuration option in which column names are strings rather than atoms so that interoperability can be achieved without headaches.

Note: the obvious (at least to me) solution of attempting to re-atomize the column names after deserialization runs into difficulties when the data structure is arbitrarily sized and arbitrarily deep.

jeremyevans commented 12 years ago

I assume by "atom" you mean symbol (new to Ruby?). Sequel ships with JSON and XML serializer plugins that correctly handle the column symbols (at an arbitrary depth), so I would recommend you use those. For MessagePack and other serialization packages, you will probably have to write your own serializer, which shouldn't be too difficult if you follow the json_serializer and xml_serializer plugin designs.
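For plain nested hashes and arrays (not Sequel models), the key-restoring half of such a serializer can be sketched roughly as below. `deep_symbolize` and the sample `row` are hypothetical names made up for illustration; they are not part of Sequel or of its serializer plugins. The sketch only restores hash keys: symbol values turn into strings in JSON and cannot be told apart afterwards.

```ruby
require "json"

# Hypothetical helper (not part of Sequel): walk nested hashes and arrays
# and convert every String hash key back to a Symbol.
def deep_symbolize(obj)
  case obj
  when Hash
    obj.each_with_object({}) do |(key, value), out|
      new_key = key.is_a?(String) ? key.to_sym : key
      out[new_key] = deep_symbolize(value)
    end
  when Array
    obj.map { |item| deep_symbolize(item) }
  else
    obj
  end
end

# Made-up sample data with symbol keys at several depths.
row = { id: 1, tags: [{ name: "a" }], meta: { owner: { login: "x" } } }

# Round-trip through JSON, then restore the symbol keys by hand.
restored = deep_symbolize(JSON.parse(JSON.generate(row)))

# The JSON standard library can also symbolize keys while parsing.
via_option = JSON.parse(JSON.generate(row), symbolize_names: true)
```

For JSON specifically, the `symbolize_names: true` option makes the hand-written helper unnecessary; the helper is mainly useful for formats whose parser has no such option.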

If the serializers you are using can't handle symbols, they can't be used generically with Ruby objects. Are you proposing that no Ruby objects use symbols, simply because your serializers can't handle them correctly?

Sequel assumes columns are returned as symbols all over the codebase. The returning of columns as symbols has been with Sequel since the beginning and will never change, nor will Sequel ever provide an option to return columns as strings, and I'm not one to use "never" lightly.

Note that it's trivial to return columns as strings if you want to do so yourself. For a plain dataset, just add a row_proc:

dataset.row_proc = proc{|r| h={}; r.each{|k, v| h[k.to_s] = v}; h}
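To see what that proc does without a database connection, here is a small sketch that applies the same block to a plain hash. The sample row is made up, and the `transform_keys` line is only an equivalent shorthand that assumes Ruby 2.5 or later.

```ruby
# Same block as the row_proc above: copy each pair into a new hash
# with the key converted to a String.
stringify = proc { |r| h = {}; r.each { |k, v| h[k.to_s] = v }; h }

# Hypothetical row, shaped like the symbol-keyed hashes a dataset yields.
row = { id: 1, name: "widget" }

stringified = stringify.call(row)

# Equivalent shorthand on Ruby 2.5+ (assumption: a modern Ruby is available).
shorthand = row.transform_keys(&:to_s)
```

The original row is left untouched; only the hash returned by the proc has string keys.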

For models, overriding #[]= and .call should be sufficient:

class Sequel::Model
  def []=(c, v) super(c.to_s, v) end
  def self.call(r) h={}; r.each{|k, v| h[k.to_s] = v}; super(h) end
end

Cloven commented 12 years ago

Fair enough -- it does make sense for someone to finally write a decent serializer, and I understand the desire to stick with symbols. Thank you for the interesting and useful-looking code fragments; those might be a good solution to the underlying problem.