infochimps-labs / ironfan

Chef orchestration layer -- your system diagram come to life. Provision EC2, OpenStack or Vagrant without changes to cookbooks or configuration
http://infochimps.com
Other
502 stars 102 forks source link

Make realms a top-level value for silverware announce/discover #266

Closed mrflip closed 11 years ago

mrflip commented 11 years ago

(this came out of an internal discussion... @temujin9 feels as follows, I have concerns. That is why you're about to see me file, then object, to this issue)

Cross-cluster discovery is complicated. Silverware already has the concept of discovery realms, but their usage needs to be cleaned up and expanded.

Currently, silverware uses node[:discovers][sys][subsys] for stating which realm it discovers in, and node[sys][subsys][:announce_as] for the realm in announces in. The former requires weird conditionals to work right with optional sys and subsys, the latter is straightforward to scope. Switch both to use the latter format (renaming the former as :discover_as), and make sys also optional (to allow node[:discover_as] and node[:announce_as] statements).

Add a deprecation for the node[:discovers] hash, and ensure all clusters are migrated to use node[sys][subsys][:discover_as].

mrflip commented 11 years ago

The one thing I feel very strongly about is the spelling :discover_as. If we're going to make the change, I would strongly prefer :discover_in => :zk; changing :announce_as to :announce_in. The as makes it sound like it belongs to the self, the discover-er -- the meaning I read is "discover me as this thing" (ie an annoucer) and not "discover this thing over there".

Otherwise, this is a borderline call, so I'm not going to kick up a big fuss if you make the alteration, except to a) point out that this requires touching every cluster definition out there and ask if the effort+risk of doing so is worth it (there, I asked; assuming the answer is "yup" you don't need to explain further); and b) to spell out the reasons why I originally did things the way they are (below).

So if you'll let me have :discover_in (or anything but _as), then read, shake your head, and close.


In the cluster definition, you may override the "realm" in which something is discovered -- so that the kafka brokers on a machine in the 'kfk' cluster can discover the zookeeper in the zk cluster.

The proposed interface does it at the bottom -- discovery is attached to the component, not to the server:

cluster :kfk do 
  override_attributes({
    :zookeeper => { 
      :discover_in => :zk
      # ... stuff about being a zookeeper client ...
    },
    :hadoop => {
      :discover_in => :hdp
      # ... stuff about being a hadoop client ...
    },
    :kfk => {
      :announce_in => :global_log_archiver,
      # ... stuff about being a kafka server ...
    }
  })
end

By contrast, currently this is done at the top:

cluster :kfk do 
  override_attributes({
    :discovers => { :zookeeper => :zk, :hadoop => :hdp },
    #
    :kfk => {
      :announce_in => :global_log_archiver,
      # ... stuff about being a kafka server ...
    }
    :zookeeper => { ... }, # stuff about being a zookeeper client
    :hadoop    => { ... }, # stuff about being a hadoop client

  })
end

Doing it at the top means that in one glance, I can tell all the places my cluster connects to. Doing it the bottom means that when I'm looking at information about eg kafka's settings, I'm also looking at the realm it discovers. I can see the appeal of each but prefer the former.

However, I believe that discovery is a concern of the cluster, and not of its components. If I were to define a helper method that was "set up everything about being a zookeeper client" and move it into a separate file to be used by all the cluster definitions, I could not put this code in that. It could only live in the cluster definition file. That says to me connections among clusters are a property of the cluster/facet, not of the component -- which to me means it's the _announcein override that should be moved to the top, not both to the bottom.


Finally, a minor nit -- in https://github.com/infochimps-labs/ironfan-pantry/blob/2b76bbd4027632e6332ee950e010bc6fe7655cd4/cookbooks/silverware/libraries/discovery.rb#L85 -- there's a helper for doing the "subsys value if it exists, els the sys value if it exists" dance that this should use.

temujin9 commented 11 years ago

So, the main reason for putting it the other way around is that I can't easily say "discover everything in this realm" (which was the motivation for this change) while preserving the semantics you'd prefer. node[:discovers_in][:default] or node[:discovery_realm] both feel like warts. Moving both to the bottom feels (and looks) semantically easier.

I see your point about the separation of duties. Let me noodle it a bit more, and see if I can find a semantic that:

Also: Where's that helper? The code I was writing was trending toward that thing anyway.

temujin9 commented 11 years ago

Okay, better solution: these are attributes of silverware. They should be scoped as such, but then conform (beneath that), to the semantics I was suggesting:

node[:silverware][sys][subsys][:discover] = 'foo'
node[:silverware][sys][subsys][:announce] = 'bar'
node[:silverware][sys][:discover] = 'baz'
node[:silverware][sys][:announce] = 'qux'
node[:silverware][:discover] = 'quux'
node[:silverware][:announce] = 'corge'

Still a bit more verbose than I like, but both well scoped and easy to manipulate. I'm deprecating but not removing the old functionality. We'll drop it at the next major release of silverware (v4.0.0), after the 3.3.0 branch has disseminated widely enough that the deprecated calls are likely fixed.

temujin9 commented 11 years ago

Ah, bugger. That collides with some existing attributes in undesirable ways. Amended:

node[:realm][sys][subsys][:discover] = 'foo'
node[:realm][sys][subsys][:announce] = 'bar'
node[:realm][sys][:discover] = 'baz'
node[:realm][sys][:announce] = 'qux'
node[:realm][:discover] = 'quux'
node[:realm][:announce] = 'corge'

I doubt we'll want a separate cookbook/service named realm, which would be the main concern there.

temujin9 commented 11 years ago

After discussion, @mrflip and I seem to agree that the original proposal, with some minor modifications, is the best one. Standardizing on:

node[sys][subsys][:discover_in] = 'foo'
node[sys][subsys][:announce_in] = 'bar'
node[sys][:discover_in] = 'baz'
node[sys][:announce_in] = 'qux'
node[:discover_in] = 'quux'
node[:announce_in] = 'corge'
temujin9 commented 11 years ago

This has been worked into silverware; the old ways are deprecated, and will be removed in silverware 4.0.