basigabri opened 4 years ago
Relates to https://github.com/elastic/cloud-on-k8s/issues/2161.
It's fairly easy for users to create their own services targeting any Pods they want using label selectors. We could automatically create X services, but it's hard to know in advance which subset of Pods users are interested in. Hence we pre-create only the default one, for an easy quickstart experience.
You can easily set up something like this:

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.6.0
  nodeSets:
  - name: ingest-nodes
    count: 3
    config:
      node.ingest: true
      node.data: false
      node.master: false
  - name: master-nodes
    count: 3
    config:
      node.ingest: false
      node.data: false
      node.master: true
  - name: data-nodes
    count: 3
    config:
      node.ingest: false
      node.data: true
      node.master: false
---
apiVersion: v1
kind: Service
metadata:
  name: es-ingest-nodes
spec:
  selector:
    common.k8s.elastic.co/type: elasticsearch
    elasticsearch.k8s.elastic.co/cluster-name: quickstart
    elasticsearch.k8s.elastic.co/node-ingest: "true"
  ports:
  - name: https
    protocol: TCP
    port: 9200
    targetPort: 9200
```
@basigabri do you think that's a good alternative?
Thank you very much @sebgl for your quick response.
I am already doing this. But then it is not a service managed by ECK, it is a service managed by me. The request is for ECK to be able to manage services by node type.
Additionally, I tried to configure Kibana to hit this custom ingest-nodes service, without success:
```
{"type":"log","@timestamp":"2020-02-17T14:30:19Z","tags":["error","elasticsearch","data"],"pid":7,"message":"Request error, retrying\nGET https://elasticsearch-ingest-nodes:9200/_xpack => Client network socket disconnected before secure TLS connection was established"}
{"type":"log","@timestamp":"2020-02-17T14:30:19Z","tags":["warning","legacy-plugins"],"pid":7,"path":"/usr/share/kibana/src/legacy/core_plugins/visualizations","message":"Skipping non-plugin directory at /usr/share/kibana/src/legacy/core_plugins/visualizations"}
{"type":"log","@timestamp":"2020-02-17T14:30:20Z","tags":["warning","plugins","licensing"],"pid":7,"message":"License information could not be obtained from Elasticsearch for the [data] cluster. Error: Request Timeout after 30000ms"}
{"type":"log","@timestamp":"2020-02-17T14:30:20Z","tags":["warning","elasticsearch","data"],"pid":7,"message":"Unable to revive connection: https://elasticsearch-ingest-nodes:9200/"}
{"type":"log","@timestamp":"2020-02-17T14:30:21Z","tags":["info","plugins-system"],"pid":7,"message":"Starting [8] plugins: [security,licensing,code,timelion,features,spaces,translations,data]"}
{"type":"log","@timestamp":"2020-02-17T14:30:21Z","tags":["warning","elasticsearch","data"],"pid":7,"message":"Unable to revive connection: https://elasticsearch-ingest-nodes:9200/"}
{"type":"log","@timestamp":"2020-02-17T14:30:21Z","tags":["warning","elasticsearch","data"],"pid":7,"message":"No living connections"}
{"type":"log","@timestamp":"2020-02-17T14:30:21Z","tags":["warning","plugins","licensing"],"pid":7,"message":"License information could not be obtained from Elasticsearch for the [data] cluster. Error: No Living connections"}
{"type":"log","@timestamp":"2020-02-17T14:30:21Z","tags":["error","elasticsearch","admin"],"pid":7,"message":"Request error, retrying\nGET https://elasticsearch-ingest-nodes:9200/.kibana_task_manager => self signed certificate in certificate chain"}
{"type":"log","@timestamp":"2020-02-17T14:30:21Z","tags":["error","elasticsearch","admin"],"pid":7,"message":"Request error, retrying\nGET https://elasticsearch-ingest-nodes:9200/.kibana => self signed certificate in certificate chain"}
{"type":"log","@timestamp":"2020-02-17T14:30:21Z","tags":["warning","elasticsearch","admin"],"pid":7,"message":"Unable to revive connection: https://elasticsearch-ingest-nodes:9200/"}
```
```
Request error, retrying\nGET https://elasticsearch-ingest-nodes:9200/.kibana => self signed certificate in certificate chain
```
Something looks wrong in the way the certificate is set up in the Kibana configuration. Can you share how you specified this configuration?
A more general comment: ingest nodes are mostly useful to pre-process documents before ingestion. It does not make much sense to route Kibana traffic to ingest nodes. Kibana is not ingesting much data, and I guess you are not pre-processing Kibana data in Elasticsearch through your own ingest pipeline?
> Something looks wrong in the way the certificate is set up in the Kibana configuration. Can you share how you specified this configuration?
```yaml
---
# Source: kibana/templates/kibana.yaml
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana
  namespace: elastic
  labels:
    app.kubernetes.io/name: "release-name-kibana"
    app.kubernetes.io/managed-by: "Tiller"
    app.kubernetes.io/release: "release-name"
    helm.sh/chart: "kibana-7.5"
spec:
  version: 7.5.0
  count: 1
  config:
    elasticsearch.hosts:
    - https://elasticsearch-ingest-nodes:9200
    elasticsearch.username: elastic
    #elasticsearch.ssl.certificateAuthorities: /mnt/usr/ca.crt
  secureSettings:
  - secretName: elastic-kibana-kibana-user
  podTemplate:
    spec:
      containers:
      - name: kibana
        volumeMounts:
        - name: elasticsearch-certs
          mountPath: /mnt/usr
          readOnly: true
      volumes:
      - name: elasticsearch-certs
        secret:
          secretName: elasticsearch-es-http-certs-internal
  http:
    service:
      spec:
        type: ClusterIP
    tls:
      selfSignedCertificate:
        subjectAltNames:
        - dns: kibana.mydns.net
  podTemplate:
    spec:
      containers:
      - name: kibana
        resources:
          limits:
            memory: 1Gi
```
The secret volume is not mounted at mountPath: /mnt/usr. It seems like the CA is not consumed correctly.
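One possible cause (an assumption on my part, not confirmed by the thread): `spec.podTemplate` appears twice in the Kibana manifest above, and with duplicate YAML keys the later one wins, so the `elasticsearch-certs` volume and its mount are silently dropped. A single merged `podTemplate` would look like:

```yaml
  podTemplate:
    spec:
      containers:
      - name: kibana
        resources:
          limits:
            memory: 1Gi
        # keep the CA mount together with the resources override
        volumeMounts:
        - name: elasticsearch-certs
          mountPath: /mnt/usr
          readOnly: true
      volumes:
      - name: elasticsearch-certs
        secret:
          secretName: elasticsearch-es-http-certs-internal
```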
```
10:53 $ k describe pod kibana-kb-7656485d54-5tdtt
Name:           kibana-kb-7656485d54-5tdtt
Namespace:      elastic
Priority:       0
Node:           aks-agentpool-93582542-vmss000000/10.240.0.4
Start Time:     Mon, 17 Feb 2020 19:08:05 +0200
Labels:         common.k8s.elastic.co/type=kibana
                kibana.k8s.elastic.co/config-checksum=e3ac7c53717f3c34a526e1c961af3d3a3f1c3715422cc0a326b11240
                kibana.k8s.elastic.co/name=kibana
                kibana.k8s.elastic.co/version=7.5.0
                pod-template-hash=7656485d54
Annotations:    <none>
Status:         Running
IP:             10.244.0.121
IPs:            <none>
Controlled By:  ReplicaSet/kibana-kb-7656485d54
Containers:
  kibana:
    Container ID:   docker://f1c278635efae5bc6854863c1b7a146c37bd2294245085111aa46faa62dbe4fa
    Image:          docker.elastic.co/kibana/kibana:7.5.0
    Image ID:       docker-pullable://docker.elastic.co/kibana/kibana@sha256:0dfe7c796a7702556cd7e9bb7e2d56be335ec22260ce569038b3aaf663afa90b
    Port:           5601/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Mon, 17 Feb 2020 19:08:07 +0200
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  1Gi
    Requests:
      memory:  1Gi
    Readiness:    http-get https://:5601/login delay=10s timeout=5s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /mnt/elastic-internal/http-certs from elastic-internal-http-certificates (ro)
      /usr/share/kibana/config from config (ro)
      /usr/share/kibana/data from kibana-data (rw)
Conditions:
  Type             Status
  Initialized      True
  Ready            False
  ContainersReady  False
  PodScheduled     True
Volumes:
  kibana-data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kibana-kb-config
    Optional:    false
  elastic-internal-http-certificates:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kibana-kb-http-certs-internal
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                     From                                        Message
  ----     ------     ----                    ----                                        -------
  Warning  Unhealthy  4m53s (x5638 over 15h)  kubelet, aks-agentpool-93582542-vmss000000  Readiness probe failed: HTTP probe failed with statuscode: 503
```
> A more general comment: ingest nodes are mostly useful to pre-process documents before ingestion. It does not make much sense to route Kibana traffic to ingest nodes. Kibana is not ingesting much data, and I guess you are not pre-processing Kibana data in Elasticsearch through your own ingest pipeline?
There was some confusion: I want Kibana to hit only what used to be called client nodes, which are now the coordinating nodes ...
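For reference, a sketch of what such a service could look like, following the label pattern from the example above. This is an assumption on my part: verify with `kubectl get pods --show-labels` that your Pods actually carry these role labels before relying on it. A coordinating-only node is one with all roles disabled:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: es-coordinating-nodes
spec:
  type: ClusterIP
  selector:
    common.k8s.elastic.co/type: elasticsearch
    elasticsearch.k8s.elastic.co/cluster-name: quickstart
    # coordinating-only nodes: master, data and ingest all disabled
    elasticsearch.k8s.elastic.co/node-master: "false"
    elasticsearch.k8s.elastic.co/node-data: "false"
    elasticsearch.k8s.elastic.co/node-ingest: "false"
  ports:
  - name: https
    protocol: TCP
    port: 9200
    targetPort: 9200
```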
Does the error persist if `elasticsearch.ssl.certificateAuthorities: /mnt/usr/ca.crt` is uncommented?
Yes, that is a copy-paste error. I don't have it commented out. It's like this:

elasticsearch.ssl.certificateAuthorities: /mnt/usr/ca.crt

Same error though, as pasted above:
```
{"type":"log","@timestamp":"2020-02-17T14:30:21Z","tags":["error","elasticsearch","admin"],"pid":7,"message":"Request error, retrying\nGET https://elasticsearch-ingest-nodes:9200/.kibana => self signed certificate in certificate chain"}
{"type":"log","@timestamp":"2020-02-17T14:30:21Z","tags":["warning","elasticsearch","admin"],"pid":7,"message":"Unable to revive connection: https://elasticsearch-ingest-nodes:9200/"}
```
The main purpose of this request, though, is for the operator to be able to manage services by node type.
Thanks a lot !
@basigabri as @sebgl mentioned it's pretty similar to existing issue https://github.com/elastic/cloud-on-k8s/issues/2161 (which is to at least document how to do this, if not manage the services in the operator). I think it makes sense to close this ticket and use that other issue to discuss extra services. We can continue trying to troubleshoot your specific configuration here though if you'd like?
There should really be an easy and reliable way to control where the data is sent. Having the default go to all nodes including master nodes seems like it's setting less attentive users up for failure.
Creating an additional service to go to ingest/client/data nodes is relatively simple.
The issue I see is that it is then difficult to use the ECK-managed Kibana/APM Server.
Unless I'm missing something, you need to create your own password secret that provides a key named `elasticsearch.password`, because the existing Kibana secret has a different key. And without `elasticsearchRef`, I would assume the config-change triggers (for certs, users, etc.) that cause a rolling restart of the APM Server or Kibana would never fire.
I would propose that you be able to keep `elasticsearchRef` and set `config.elasticsearch.hosts`. As of right now, it merges your host with the other one, so you end up with 2 hosts set. If this simply didn't merge, everything would work fine: ECK would fully manage your APM Server / Kibana, except for you overriding where it points.
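A sketch of what that proposal would look like, assuming the merge behavior were changed so the explicit hosts win (the service name `my-custom-es-service` is illustrative, not something ECK creates):

```yaml
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana
spec:
  version: 7.5.0
  count: 1
  # keep the reference so ECK still manages certs, users and rolling restarts ...
  elasticsearchRef:
    name: quickstart
  config:
    # ... but route traffic through a custom per-node-type service
    elasticsearch.hosts:
    - https://my-custom-es-service:9200
```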
Update: My above proposal works on APM server but not on Kibana... not sure if it's a bug or just intended behavior.
I wonder if we could try to be smart and use the knowledge ECK has about the current topology of the cluster to create two different services.
Creating any other dedicated services (I cannot think of a good use case for those right now) would be the responsibility of the user.
Currently ECK creates only one Service inside k8s, exposed on port 9200 with a ClusterIP.
This Service, named {cluster_name}-es-http, includes as endpoints all node types (master, data, ingest).
The other per-node-type services (ingest, data, master) are not exposed: they don't have a ClusterIP.
Kibana, for example, uses the exposed svc to query Elasticsearch. This means it hits all endpoints, including masters. We need to create exposed services with a ClusterIP per node type, for example a svc only for ingest nodes, so that only those Pods are hit.