att-comdev / openstack-helm

PROJECT HAS MOVED TO OPENSTACK
https://github.com/openstack/openstack-helm

plan: discuss rabbit future state/plans #246

Open v1k0d3n opened 7 years ago

v1k0d3n commented 7 years ago

Kubernetes Version (output of kubectl version): N/A
Helm Client and Tiller Versions (output of helm version): N/A
Development or Deployment Environment?: N/A
Release Tag or Master: N/A
Expected Behavior: N/A
What Actually Happened: N/A
How to Reproduce the Issue (as minimally as possible): N/A
Any Additional Comments:

Background: I think we need to discuss how to appropriately handle dependencies again. Up to this point, any deployment waiting in an init state was held on either an init container or a job within the single service-level chart. We now have changes within our rabbit chart that will wait in an init state because of an undeclared dependency outside of the rabbit chart (etcd; another chart). If an operator deploys rabbit, it will be blocked until the separate etcd chart is deployed. The operator must know in advance that rabbit requires etcd, which is fundamentally different from our other charts. It also went undocumented, so as it stands today users have to figure this out on their own. This challenges how we're handling dependencies for the project. We'll also need to address endpoints the same way as our other charts (as Pete works through these).
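For illustration, one way to make the relationship explicit would be to declare etcd in the rabbit chart's requirements.yaml so Helm resolves it for the operator; a minimal sketch, where the version and repository values are placeholders rather than our actual settings:

```yaml
# requirements.yaml for the rabbitmq chart (Helm v2 convention).
# Version and repository below are placeholders, not project values.
dependencies:
  - name: etcd
    version: 0.1.0                # hypothetical chart version
    repository: file://../etcd    # assumed sibling chart path
```

Whether we want Helm-managed subcharts like this, or documented operator-driven ordering, is exactly the dependency-handling question we should settle.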

I'm concerned about carrying this mixed message over into OpenStack proper without a formally documented roadmap to RabbitMQ 3.7.0. We should document the edge cases appropriately as part of the PR, along with our plans to bring them back into the fold with our vision [dependency handling].

Also, one other smaller consideration: if we're pulling large portions of source from upstream projects, we need to ensure that those sources are committed to maintaining the code (Dockerfiles/images, etc.). For example, what would happen if this code/image were suddenly dropped by the maintainer? We don't want to own extra debt in our repo, but we should strike a balance against the "what if" scenario of upstream changes. We can handle this by documenting our sources (and by keeping track of licensing, which could potentially be more concerning).
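For illustration only, a simple provenance record checked in alongside the charts could cover both the maintenance and licensing points; the file name, fields, and values below are a hypothetical sketch, not an existing convention in this repo:

```yaml
# sources.yaml — hypothetical provenance record for vendored code.
# Records where each piece came from, its license, and the commit we
# copied, so a dropped upstream doesn't leave us guessing.
vendored:
  - path: rabbitmq/autocluster/     # assumed in-repo location
    upstream: https://github.com/aweber/rabbitmq-autocluster
    license: BSD-3-Clause           # assumed; verify against upstream
    pinned-commit: <sha>            # placeholder, set when vendoring
```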

cc: @intlabs @alanmeadows @ss7pro

intlabs commented 7 years ago

In addition to the points @v1k0d3n raises above, there are a few issues I can see that we should discuss at our next community meeting:

ss7pro commented 7 years ago

Really good analysis. While evaluating clustering backends for the autocluster plugin, we (Intel) implemented native support for k8s (https://github.com/aweber/rabbitmq-autocluster/blob/master/src/autocluster_k8s.erl), but we withdrew it after Mirantis did a full investigation of clustering methods. The results of their research are reflected in multiple improvements to the etcd clustering backend, the most important being avoidance of a race condition on startup (leader election with a lock) and split-brain avoidance.

I would simply suggest adding RabbitMQ with the autocluster plugin to Kolla so that this solution lives in a well-recognized place; IMHO there's no better RabbitMQ clustering solution on top of k8s at the moment. The one developed by Mirantis runs on multiple Intel production clusters without any issues (although we still use Stackanetes there).
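For context, the plugin's etcd backend is configured through environment variables on the RabbitMQ container; a rough sketch as I recall the rabbitmq-autocluster README (treat the exact variable names, values, and the etcd service address as assumptions to verify):

```yaml
# Partial container spec (sketch only). Env var names follow the
# rabbitmq-autocluster docs as best I recall; the service name "etcd"
# is an assumption about the etcd chart.
containers:
  - name: rabbitmq
    image: rabbitmq:3.7-management   # placeholder image/tag
    env:
      - name: AUTOCLUSTER_TYPE
        value: etcd      # discover peers through etcd keys
      - name: AUTOCLUSTER_FAILURE
        value: stop      # stop on failure rather than form a split island
      - name: ETCD_HOST
        value: etcd      # assumed etcd service name
      - name: ETCD_PORT
        value: "2379"
```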

We only discovered that later (issue #292) when deploying openstack-helm in our clusters, but simply ran out of time to upstream a fix for it.

I don't really understand the concept of proper attribution; what do you mean by that? A README file with credits to the Mirantis repo?