RailsEventStore / rails_event_store

A Ruby implementation of an Event Store based on Active Record
http://railseventstore.org
MIT License
1.42k stars 121 forks source link

Reading a JSON-serialized event from database breaks #367

Closed valscion closed 5 years ago

valscion commented 6 years ago

Hi there!

I noticed that if I tried to read a persisted JSON serialized event, I got an error from the RubyEventStore::Metadata#[]= method:

https://github.com/RailsEventStore/rails_event_store/blob/cb71504fc4dce24d642017f9b59a9930938e3190/ruby_event_store/lib/ruby_event_store/metadata.rb#L24

The reason seems to be that with this config (as is the document use case):

Rails.configuration.event_store = RailsEventStore::Client.new(
  mapper: RubyEventStore::Mappers::Default.new(
    serializer: JSON
  )
)

...the Default mapper ends up trying to set metadata to JSON.load(record.metadata) here:

https://github.com/RailsEventStore/rails_event_store/blob/cb71504fc4dce24d642017f9b59a9930938e3190/ruby_event_store/lib/ruby_event_store/mappers/default.rb#L20-L27

...and the output of JSON.load has keys as String instead of Symbol.

I fixed this locally by writing a wrapper serializer for JSON that symbolizes the keys when loading a persisted event:

module JSONSymbolizingSerializer
  def self.dump(hash)
    JSON.dump(hash)
  end

  def self.load(string)
    # RubyEventStore breaks if deserialized data does not have symbols as keys.
    JSON.parse(string).symbolize_keys
  end
end

Rails.configuration.event_store = RailsEventStore::Client.new(
  mapper: RubyEventStore::Mappers::Default.new(
    serializer: JSONSymbolizingSerializer
  )
)

Should we update the documentation to show this usage example for JSON, or is this something that could be fixed inside the gem itself?

mostlyobvious commented 6 years ago

Thanks for reporting this issue. Indeed, JSON.load returns Hash with string keys.

Should we update the documentation to show this usage example for JSON

That won't hurt to be more specific in the docs with the code and its behavior right now. I wonder if there are any drawbacks...

or is this something that could be fixed inside the gem itself?

I have to contemplate on it a bit more.

I fixed this locally by writing a wrapper serializer for JSON that symbolizes the keys when loading a persisted event

Since the serializer works both for metadata and data, it means you'll symbolize data as well, right? And that is not symbolizing nested keys if you do have them.

valscion commented 6 years ago

Since the serializer works both for metadata and data, it means you'll symbolize data as well, right? And that is not symbolizing nested keys if you do have them.

Oh true. We had this kludge in place, and the workaround isn't yet merged so I should fix it, too.

class DiscussionThreadStarted < ::RailsEventStore::Event
  def initialize(params)
    # KLUDGE: Make sure that as we're getting data back from JSON, the keys can still be
    # read as symbols (which is the input format)
    super(params.deep_symbolize_keys)
  end

  # ...rest of the event implementation
end

So yeah, we'd probably need to do a deep symbolization if we'd want to keep the old behavior we had with this kludge in place.


So maybe JSON should have it's own mapper instead of using RubyEventStore::Mappers::Default? That mapper could symbolize only the metadata keys and leave data keys as they were. That would be the minimally invasive solution for this error we're seeing.

😕 I'm gonna have to think on this some time more.

paneq commented 6 years ago

I absolutely agree this is a big issue. I haven't responded so far because I was not sure what's the good way to tackle this issue.

My best guess is that RES should symbolize_keys on the metadata returned from all mappers.

mostlyobvious commented 5 years ago

@valscion #489 has landed in master and will get released soon

valscion commented 5 years ago

Thanks! I'll see if it makes our wrapper code redundant.

joelvh commented 5 years ago

@pawelpacana @valscion @paneq I just implemented JSON serialization in the ROM adapter and there's another factor, which is date/time deserialization. YAML does this automatically, while JSON remains a string, so Event#timestamp (metadata :timestamp) value is a string instead of a date/time.

This can be deserialized further by overriding the Event constructor, which would check for :timestamp or any other metadata (and event data, for that matter) to further deserialize things.

What are your thoughts on approaching this aspect of JSON in terms of deserialization needs specific to schema/data typing of date/time?

I'm becoming now focused more on solving for pipelines (#400) to deal with this issue and also likely need for some encryption features on my current project (#451). And with all of this, of course we want to share experiences with GDPR (#372).

P.S. I think it would be extremely helpful to catalog examples for various solutions, whether encryption, GDPR, serializers, etc. Is there a "home" for example solutions/code in the docs or somewhere that can be made official to keep adding to?

mostlyobvious commented 5 years ago

Caring for correct timestamp deserialization is in scope of RES. I'm not sure whether we should care for other data/metadata not set by us -- I'd rather encourage users to use event schemas in their own code. By encouraging I mean providing documentation and examples.

Maybe timestamp should be part of SerializedRecord -- outside of metadata string on that layer. Mappers would care to put it where it is now on event object. Repositories can take different strategies how to persist it. In database schema we even have created_at that is used by ActiveRecord and could repurposed to store this timestamp instead.

mostlyobvious commented 5 years ago

P.S. I think it would be extremely helpful to catalog examples for various solutions, whether encryption, GDPR, serializers, etc. Is there a "home" for example solutions/code in the docs or somewhere that can be made official to keep adding to?

@mpraglowski had the idea of contrib/ in repository for 3rd party customizations, where the rules for mutation testing might be lowered and we don't recognize this as an official release (yet)

Tutorials section in docs could be fine for this. Or maybe new one (Ideas?). It is easier to categorize once the content is there, feel free to add and reorganize later.

joelvh commented 5 years ago

@pawelpacana I was actually thinking about a "contrib" for RES the other day as well. I like that idea a lot, and it could prove to be a very good toolkit to enhance RES without polluting it, and also better showcase what RES can enable applications to do. (Arguably, many of the Rails-based features I might even consider moving to contrib -- I know you guys switched from RubyEventStore to RailsEventStore for more visibility in the community, but I'm sure RES living in Rails has been a driver for @mpraglowski to suggest contrib because of the easy scope creep 😄 ).

Maybe @mpraglowski can create an issue for "contrib" and we can put a checklist in there of some of the ideas? Might be the way to track GDPR, encryption, additional serializers, etc in a more central place.