basho-labs / puppet-riak

A puppet module to deploy Riak clusters
Apache License 2.0

how about forming a cluster after installation #41

Open gavinHuang opened 9 years ago

gavinHuang commented 9 years ago

This module only installs Riak. Do you have any plans to add more features, such as forming a cluster by running the cluster join command?

haf commented 9 years ago

Puppet 4.0 might add some coordination between nodes, making it possible to join clusters as a part of the puppet run. Perhaps @hlindberg knows about a time plan for that?

hlindberg commented 9 years ago

Puppet 4.0 will be released sooner than expected and with a smaller scope (sept/oct 2014), so no new fancy orchestration beyond what can already easily be done with exported resources and PuppetDB.

haf commented 9 years ago

@gavinHuang So it seems there will be some more waiting before clustering is supported natively in puppet. Some modules go the route of always telling the software to cluster, but I'm not sure how riak would take it if we told the first node to cluster without any other nodes present. Is there a 'wait for response' flag on the clustering command that can block the puppet resource? (/cc @jsmartin )

danieldreier commented 9 years ago

@haf I don't think there's any barrier to using puppetdb to do this sort of clustering if you use puppet 4 / future parser DSL. I like using puppetdbquery for this sort of thing. For example, you can do something like:

$cluster_members = query_nodes("Class[Riak]", 'ipaddress')

if $cluster_members.count > 1 {
  # install riak and attempt to bootstrap a cluster
  $join_cmd = "riak-admin cluster join riak@${cluster_members[0]}"
  # we should create a type/provider wrapping riak-admin to manage cluster membership instead of using an exec here
} else {
  # install riak but do not configure a cluster
}

The query_nodes call creates an array of the ipaddress fact values from all nodes that have compiled catalogs containing the "riak" class. An arbitrary member of $cluster_members (the first one) is selected as the target to try to join. If we actually had logic in a provider around riak-admin we would pass it the IP rather than constructing a riak-admin command as a string.

In reality we'd want a more complex query because in a real environment this would provide no way to have multiple riak clusters; different environments, stages, etc would get mixed together. I'm just trying to illustrate the general principle.
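
As a rough sketch of what a cluster-scoped version could look like inside a profile class: this assumes Puppet 4 / future parser, puppetlabs-stdlib for sort, and puppetdbquery for query_nodes. The profile::riak class name, the $cluster_name parameter, and the riak_cluster custom fact are all made up for illustration; none of them exist in this module. The first member of the sorted list acts as the seed and never joins anyone, so the first node simply does nothing until a second node shows up.

```puppet
# Hypothetical profile class, not part of puppet-riak.
class profile::riak (
  String $cluster_name = 'default',
) {
  include ::riak

  # Assumption: every Riak node reports a custom fact named riak_cluster.
  # Filtering on it keeps separate clusters (prod, staging, ...) apart.
  $query = "Class[Riak] and riak_cluster=\"${cluster_name}\""

  # Sort so that every node agrees on which member is the seed.
  $cluster_members = sort(query_nodes($query, 'ipaddress'))

  if $cluster_members.count > 1 and $cluster_members[0] != $::ipaddress {
    # This node is not the seed, so it should join the seed.
    $join_cmd = "riak-admin cluster join riak@${cluster_members[0]}"

    # Placeholder only: a real implementation would hand the seed IP to a
    # type/provider wrapping riak-admin rather than print the command.
    notify { "riak join command: ${join_cmd}": }
  }
}
```

Note that riak-admin cluster join only stages the change; riak-admin cluster plan and riak-admin cluster commit still have to run before the join takes effect, so this is only one piece of the puzzle.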

@haf does that seem to meet the requirements you see?

haf commented 9 years ago

It's something that could work, yes!

mbbroberg commented 9 years ago

Moving to Ready based on discussion - PR is welcome if it'll handle the need!

danieldreier commented 9 years ago

we might be better off documenting how to implement this using the roles and profiles pattern rather than necessarily implementing it in the module itself, because if we make puppet 4 / future parser DSL a hard dependency for this module it'll be unusable for a large proportion of current puppet users.

mbbroberg commented 9 years ago

+1. PR as you see fit @danieldreier and we'll get it in so @gavinHuang can run it at his discretion.

danieldreier commented 9 years ago

I'm holding off on a PR to fix this until we've got Riak 2 coverage; I've never run anything on riak 1 and don't think it's worth investing time in that - but don't want to write docs that I can't test.

Speaking of docs - @laurenrother might be interested in this since she's about to move from Puppet Labs to Basho.

edit: another thing this is blocking on is cross-node testing using beaker. I'm hesitant to make strong proposals for automating clustering when I can't test it.

danieldreier commented 9 years ago

@mjbrender is there a way to interact with riak-admin in a more automation-friendly way? I can't figure out how to get JSON or other structured output out of it, and I can't find HTTP API endpoints for the same functionality. I'd hate to have to regex-parse the output from riak-admin.

danieldreier commented 9 years ago

PR https://github.com/basho-labs/puppet-riak/issues/53 describes a proposal for a type/provider to facilitate automated clustering. It's not the whole thing, just a building block, but since people in this issue have expressed interest I'd value feedback on that approach.

I'm planning on starting to write code for that this upcoming weekend, so if I don't hear anything I'll kind of have to work with my best guesses and just try to iterate until it's not terrible. I don't actually have any experience running riak in production so input from a more experienced user on what pain points to address would be pretty valuable.