neo4j-contrib / neo4j-helm

Helm Charts for running Neo4j on Kubernetes [DEPRECATED]
https://neo4j-contrib.github.io/neo4j-helm/user-guide/USER-GUIDE.html
Apache License 2.0
88 stars 81 forks source link

improve service that is used to expose the cluster for consumption by clients #83

Closed eastlondoner closed 4 years ago

eastlondoner commented 4 years ago

Add com.neo4j.<something> labels so we can get hold of it easily using the Kubernetes API & label selectors - this is the biggie.

Does it need publishNotReadyAddresses: true any more? might be better to remove it if not needed.

moxious commented 4 years ago

The services already have appropriate labels that get generated like this:

  labels:
    app.kubernetes.io/managed-by: {{ .Release.Service | quote }}
    app.kubernetes.io/instance: {{ .Release.Name | quote }}
    helm.sh/chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    app.kubernetes.io/name: {{ template "neo4j.name" . }}
    app.kubernetes.io/component: core

This lets you know for sure that:

What else would we need to express not captured by that? The actual labels chosen are helm conventions, which explains why none of it is com.neo4j.whatever.

RE: do we need publishNotReadyAddresses? My understanding is yes we definitely need them. To form a cluster, it has to be able to contact the service. If the service doesn't form until the container is ready, then you have an unsatisfiable catch-22 because causal cluster can't come available until it discovers the other cluster members. If this is not the case, then please expand

eastlondoner commented 4 years ago

We have "discovery" services I am going to call them that because that's the name they're given in helm charts, although they do all internal services and we have what I am going to call "bolt and http" Services.

The labels I would like are something that allows me to distinguish between the two when writing a label selector. The Labels you point out logically apply to both.

The "bolt and http" services are the ones that I don't think need to be enabled when pods aren't ready. I am fully aware of the need for that on the "discovery" services.

moxious commented 4 years ago

OK I think I better see what you mean. What would better help discriminate?

The labels I pasted above are those on the user-facing bolt/https service. The ones on the internal discovery service are as follows:

metadata:
  name: discovery-{{ template "neo4j.fullname" $dot }}-{{ . }}
  labels:
    neo4j.com/coreindex: "{{ $index }}"
    neo4j.com/cluster: {{ template "neo4j.fullname" $dot }}
    neo4j.com/role: CORE
    app.kubernetes.io/managed-by: {{ $dot.Release.Service | quote }}
    app.kubernetes.io/instance: {{ $dot.Release.Name | quote }}
    helm.sh/chart: "{{ $dot.Chart.Name }}-{{ $dot.Chart.Version }}"
    app.kubernetes.io/name: {{ template "neo4j.name" $dot }}
    app.kubernetes.io/component: core

As the discovery service has a neo4j.com/coreindex (which is already targeted by the label selector for discovery) - there's already a way to discriminate between the internal & external services. But perhaps I don't get the use case right.

Which label on which service gets you what you're requesting?

eastlondoner commented 4 years ago

Specifying the bolt/http service by exclusion of other services that the helm chart has is, I think, not ideal. It obfuscates the intention of the selector, it's not necessarily future proof and requires me to look up the syntax for specifying that something should not have a particular label.

I think we could improve by allowing users to specify explicitly that they want: Something like neo4j.com/cypher/bolt=true neo4j.com/browser/http=true

moxious commented 4 years ago

@eastlondoner how about just boolean flags for services exposed, excluding port information:

neo4j.com/exposes/$SVC: (true|false)

Where $SVC in (bolt, https, http, discovery, transaction, raft, prometheus, jmx, graphite)

n.b. this won't help you if someone has configured a non-standard bolt port

eastlondoner commented 4 years ago

Sound perfect.

The only thing I wonder about is that http from 4.2 onward could be 2 different things: http (browser) and http (cypher).

Lets also pretend that at some point we (Optionally) move the internal checks to their own port.

Sent from my iPhone

On 31 Aug 2020, at 20:11, moxious notifications@github.com wrote:

 @eastlondoner how about just boolean flags for services exposed, excluding port information:

neo4j.com/exposes/$SVC: (true|false) Where $SVC in (bolt, https, http, discovery, transaction, raft, prometheus, jmx, graphite)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

moxious commented 4 years ago

HTTP (browser) or HTTP (cypher) ... who cares? Both are HTTP. To come further, one needs a concrete scenario of what you'd be trying to do with label selectors. I'm a bit concerned with this ticket because we don't yet have an expression of what one would try to select and why that would be important. That kind of story would in turn drive which selectors would seem most natural to enable that.

Neo4j is going to change going forward no matter what we pick. And not knowing what gets into which release means that without the story of the use case we're trying to select for, it'll be near-impossible to pick something that's adaptable for the future - but on the other hand, if we put new label selectors in, people will come to expect that they stay there indefinitely, as it's a lure to get people to write new label selector code, which can subsequently get broken by changing the labeling strategy.

I'll keep this open but right now I'm feeling like I don't know enough to choose the right labeling approach. Sure, I could just do boolean yes/no this service has that affordance. That's not hard to do but neither is annotating things after the install either. Specific needs should guide the solution

eastlondoner commented 4 years ago

Ok I don’t want to get deep on this either. But this came from me trying to do something and finding it awkward and unclear. I was trying to write code that could connect to the Kubernetes api and list available neo4j bolt services. Which is something will want neo4j desktop to do in the near future.

Could we label the services that are supposed to be used for bolt and leave the rest for the future?

Sent from my iPhone

On 31 Aug 2020, at 20:33, moxious notifications@github.com wrote:

 HTTP (browser) or HTTP (cypher) ... who cares? Both are HTTP. To come further, one needs a concrete scenario of what you'd be trying to do with label selectors. I'm a bit concerned with this ticket because we don't yet have an expression of what one would try to select and why that would be important. That kind of story would in turn drive which selectors would seem most natural to enable that.

Neo4j is going to change going forward no matter what we pick. And not knowing what gets into which release means that without the story of the use case we're trying to select for, it'll be near-impossible to pick something that's adaptable for the future - but on the other hand, if we put new label selectors in, people will come to expect that they stay there indefinitely, as it's a lure to get people to write new label selector code, which can subsequently get broken by changing the labeling strategy.

I'll keep this open but right now I'm feeling like I don't know enough to choose the right labeling approach. Sure, I could just do boolean yes/no this service has that affordance. That's not hard to do but neither is annotating things after the install either. Specific needs should guide the solution

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

moxious commented 4 years ago

@eastlondoner final note on this for now...I was able to remove publishNotReadyAddresses for the core set but not for the replica set. The core set definitely doesn't need it, you're right -- because they discover one another via the discovery services, etc.

RRs are a different matter. They discover the core as usual, but due to akka pickiness in 4.0, their advertised address (which is the bolt/http endpoint service) has to match how the resolve. Which it can't, unless the service is published before the pod is ready. So for readreplica-dns.yaml, publishNotReadyAddresses stays.