propershark / shark

An event publisher for realtime transit information.
3 stars 0 forks source link

Object embedding should be its own middleware #11

Closed faultyserver closed 8 years ago

faultyserver commented 8 years ago

While it makes sense to embed objects in Conductor (as it is the reason that embeds need to exist), it makes more sense to me that embedding takes place in a separate middleware, that would go before Conductor in the stack.

My current reasoning is this:

elliottwilliams commented 8 years ago

Agreed. I was reading through Conductor yesterday because I need to check which events triggered associations, and I think it'd fit in better as its own middleware. It'd certainly make it easier to ascertain how associations are made from the code.

faultyserver commented 8 years ago

After writing a bit of the implementation, I think it makes more sense to place this new middleware after Conductor in the stack. That way, any events that Conductor fires will also go through the embedding process, making the events that get published more consistent.

So now, the stack is

Transport
ObjectEmbed
Conductor
Agency (the core system)
faultyserver commented 8 years ago

Having gotten most of the way through the implementation for this, I'm thinking have a more-generic Serialization middleware is more fitting than one dedicated to "embedding objects", which doesn't actually do anything until an object gets serialized to a hash anyway (an Object instance is a plain Object instance until it gets serialized. The embedding we're trying to accomplish doesn't change anything about the Object instance, it just replaces values in the serialized Hash).

So, my proposal is to change ObjectEmbed to Serializer, and have each Shark::Object subclass that wants to embed information implement serialize_for(event, opts={}). It seems like object embeds are mostly event-agnostic, but having the event parameter will enable potential specifications later on.

I like the idea of having the serialization be something that an Object does itself. The to_h implementation started this idea from the very beginning, where ObjectManager#fire was providing object.to_h as the first argument to an event, the rationale being "it has to happen sometime; why not now?". This has since been changed on the object_embed branch to provide Object instances as parameters right up until events get published by Transport. Having the logic internal to an Object also makes it easier for other middlewares to effect changes in the serialization of an Object (by setting flags or other means). I can't think of any uses for this, but it could happen.

On the other hand, I don't like the idea of having serialization handled by Object classes because it's a bundling of view logic into a model. Not that it's necessarily terrible, but the two are rather unrelated. The only actual advantage I can currently see to having it in the Object classes is better obeying the Lawguideline of Demeter, since almost all of the necessary attributes will be directly accessible. The advantages to having serialization logic managed externally include unbundling/isolating the serialization logic, and better configurability (with Shark::Configurable); while underwhelming, these are certainly nice to have.

tl;dr: ObjectEmbed will become Serializer, which will map all of the args/kwargs for an event with Object#serialize_for(event_type) or Serializer#serialize(arg, for: event_type) (decision pending on which). The change is occurring for better semantics in that embedding requires serialization to actually have any effect.

elliottwilliams commented 8 years ago

That makes sense to me. Since all serialized views from Shark represent a single object, it doesn't feel like you are bundling in any logic that isn't already directly related to the object's properties.

faultyserver commented 8 years ago

My qualm with having it as part of Shark::Object was that how exactly an object gets serialized (which attributes are included, whether associations are followed, etc.) is dependent on its context.

For example, a Route object as a top-level argument will be serialized with all of its attributes and its associated objects. However, when it is embedded inside a Vehicle object, it only includes a few attributes (identifier, name, short_name, color) and does not include its associated objects.

This context dependency seems to mean that serialize_for will either have to take a context parameter of some sort (or an option like simplified: true) or objects that embed other object's attributes will have to do that embedding manually, which is somewhat brittle.

I'm still a little torn on where the implementation should go, but I'm going to try writing it as part of Serializer first, and see how that goes. If it feels messy/awkward, I'll move it into Shark::Object.

faultyserver commented 8 years ago

Alright. I'm not entirely convinced that having serialization inside of Shark::Object is best, but it is definitely the easier solution, so I'll be implementing that instead.

Serializer will still provide configuration options for serialization through Serializer.configure, but these will simply be passed on to the instance's to_json method to deal with as it pleases.

The advantage to this approach is simplicity: All native Ruby objects (numerics, strings, arrays, etc.) have a default to_json implementation, and Shark::Object#to_json already takes care most Objects; all that has to be written is specializations for partial embedding routes on vehicles, stations on routes, and other such embeds.

faultyserver commented 8 years ago

Once again, I'm having different thoughts now.

to_json doesn't actually work desirably, here: it returns the JSON string representation of an object rather than the hash. I had confused this with as_json, which would actually do what I want here (convert Objects to hashes but leave simple types as-is).

However, native objects do not implement as_json (nor do they implement to_h), meaning that method can not be consistently used, and we're back to square one with a large case statement to do type checking.

ruby_wamp_client (the WAMP gem I'm using) doesn't like taking string arguments (it expects args/kwargs as an array/hash, and calls to_json on them).


With that, I'm debating whether Serializer is really necessary at all.

While it provides opt-in functionality and a nice, isolated configuration scheme, it really doesn't do much, since the WAMP client expects plain objects and calls to_json on them anyway.

I think I'm going to try adding Shark::Configurable to Shark::Object and having a default configuration that looks something like:

Shark::Object.configure do |config|
  ...
  ### Serialization ###

  # Set to true to embed all objects in a one-to-many association. For example:
  # when true, a Route will embed object information for each Station in its
  # `stations` attribute; when false, the attribute would be kept as a list of
  # identifiers.
  config.embed_has_many_associations = true
  # The number of levels through which object embedding will occur.
  # A value of 0 will perform no embedding on Objects.
  # A value of 1 will embed Objects that are direct attributes of an Object.
  # Values higher than 1 will embed further-nested Object attributes.
  config.embed_depth = 1
end

For special cases, the configuration can be overwritten both at the class level, and the instance level:

# Class-level specialization
Shark::Route.configure do |config|
  config.embed_has_many_associations = false
end

# Instance-level specialization
vehicle = Shark::Vehicle.new
vehicle.configure do |config|
  config.embed_depth = 3
end

These options will all get parsed by the to_json method on Shark::Object by default (or by any subclass implementation) when the WAMP client calls it on each argument for an event.

faultyserver commented 8 years ago

Closed by #18.