Open ganilmca opened 3 years ago
This issue should IMHO be moved to the "security" repo because it originates from the security plugin.
I cannot reproduce this issue.
What I did:
1) Create a docker compose file with three opensearch nodes (see below)
2) docker compose up
3) Wait until cluster is ready and curl https://localhost:9200/_cluster/health?pretty -k -u admin:admin
reports three nodes
4) Kill one node (by killing the container)
5) curl https://localhost:9200/_cluster/health?pretty -k -u admin:admin
reports two nodes (which is correct)
6) Start the killed container again, wait a few secs
7) curl https://localhost:9200/_cluster/health?pretty -k -u admin:admin
reports three nodes (which is correct)
8) No exceptions in the logs of any node
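The manual curl checks in steps 3, 5 and 7 can also be scripted. The following is only a sketch: the function names `fetch_health` and `wait_for_nodes` are made up for illustration, and skipping certificate verification mirrors the `-k` flag used above, which is only reasonable with the demo certificates.

```python
import base64
import json
import ssl
import time
import urllib.request


def fetch_health(url, user, password):
    """GET the cluster health endpoint and return the decoded JSON body."""
    # Equivalent of `curl -k`: do not verify the (demo) TLS certificate.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(req, context=ctx, timeout=10) as resp:
        return json.load(resp)


def wait_for_nodes(expected, fetch, attempts=30, delay=2.0):
    """Poll `fetch()` until number_of_nodes equals `expected`.

    Returns True on success, False once all attempts are used up.
    """
    for _ in range(attempts):
        try:
            if fetch().get("number_of_nodes") == expected:
                return True
        except (OSError, ValueError):
            # Node unreachable, HTTP error, or non-JSON body: retry.
            pass
        time.sleep(delay)
    return False


# Example call (not executed here; requires the running cluster):
# wait_for_nodes(3, lambda: fetch_health(
#     "https://localhost:9200/_cluster/health", "admin", "admin"))
```

Passing the fetch step in as a callable keeps the polling logic independent of the network, so it can be exercised without a cluster.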
docker-compose.yml
services:
  opensearch-node1:
    image: opensearchproject/opensearch:latest
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node1
      - discovery.seed_hosts=opensearch-node1,opensearch-node2,opensearch-node3
      - cluster.initial_master_nodes=opensearch-node1,opensearch-node2,opensearch-node3
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the OpenSearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - opensearch-data1:/usr/share/opensearch/data
    ports:
      - 9200:9200
    networks:
      - opensearch-net
  opensearch-node2:
    image: opensearchproject/opensearch:latest
    container_name: opensearch-node2
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node2
      - discovery.seed_hosts=opensearch-node1,opensearch-node2,opensearch-node3
      - cluster.initial_master_nodes=opensearch-node1,opensearch-node2,opensearch-node3
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data2:/usr/share/opensearch/data
    networks:
      - opensearch-net
  opensearch-node3:
    image: opensearchproject/opensearch:latest
    container_name: opensearch-node3
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node3
      - discovery.seed_hosts=opensearch-node1,opensearch-node2,opensearch-node3
      - cluster.initial_master_nodes=opensearch-node1,opensearch-node2,opensearch-node3
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data3:/usr/share/opensearch/data
    networks:
      - opensearch-net
volumes:
  opensearch-data1:
  opensearch-data2:
  opensearch-data3:
networks:
  opensearch-net:
Hi Team,
We have deployed the cluster on VMs (a 3-node cluster), not in Docker. Please help us resolve this issue on a VM-based cluster.
Please let us know if you need any further information from us.
Thank you.
I would like to take up this issue.
Hi @ganilmca, I tried to reproduce this issue by deploying a cluster with 3 nodes and did not get the error you mention. Could you share more steps or details on how to reproduce it?
Hi @xuezhou25
We tried redeploying the OpenSearch cluster on other VMs, but we hit the same issue: after restarting a host, we could no longer get the cluster health from that host. We got the same error as before:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "security_exception",
        "reason" : "Unexpected exception cluster:monitor/health"
      }
    ],
    "type" : "security_exception",
    "reason" : "Unexpected exception cluster:monitor/health"
  },
  "status" : 500
}
Can you please confirm one thing: did you deploy the cluster on VMs or with the Docker image?
Thank you.
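When checking several hosts, it can help to tell the error body quoted above apart from a normal health response programmatically. A minimal sketch, assuming the response has one of the two JSON shapes seen in this thread; the function name `classify_health` is hypothetical.

```python
import json


def classify_health(body):
    """Classify a /_cluster/health response body.

    Returns ("error", "<type>: <reason>") when the body carries an
    "error" object, otherwise ("ok", <cluster status colour>).
    """
    doc = json.loads(body)
    if "error" in doc:
        err = doc["error"]
        return ("error", f'{err.get("type")}: {err.get("reason")}')
    return ("ok", doc.get("status"))


# Example with the error body reported in this thread:
# classify_health('{"error": {"type": "security_exception", '
#                 '"reason": "Unexpected exception cluster:monitor/health"}, '
#                 '"status": 500}')
```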
Sure, I deployed the cluster on a VM (Ubuntu). I did a tarball installation and modified opensearch.yml.
Do you mean deploying 3 nodes on 3 VMs (same port, different IP addresses)?
@xuezhou25
Thanks for confirming. We still cannot get the cluster health on all 3 hosts; it works on only one host. We redeployed on other IPs with port 9200 and still got the same "security exception" error. We edited the opensearch.yml file as follows:
cluster.name: opensearch-elasticsearch
node.name: ${HOSTNAME}
path.data: /var/lib/scylla/elastic/opensearch/data
path.logs: /var/lib/scylla/elastic/opensearch/logs
network.host: x.x.x.x
http.port: 9200
transport.tcp.port: 9300
discovery.seed_hosts: ["x.x.x.x:9300","x.x.x.x:9300","x.x.x.x:9300"]
cluster.initial_master_nodes: ["x.x.x.x","x.x.x.x","x.x.x.x"]
Please have a look and share your opensearch.yml file; we will try with your file.
Please help us get this working.
Thank you,
My opensearch.yml file:
Node: 1
node.name: node-1
network.host: 192.168.0.3
discovery.seed_hosts: ["192.168.0.3", "192.168.0.10", "192.168.0.11"]
cluster.initial_master_nodes: ["192.168.0.3", "192.168.0.10", "192.168.0.11"]
Node: 2
node.name: node-2
network.host: 192.168.0.10
discovery.seed_hosts: ["192.168.0.3", "192.168.0.10", "192.168.0.11"]
cluster.initial_master_nodes: ["192.168.0.3", "192.168.0.10", "192.168.0.11"]
Node: 3
node.name: node-3
network.host: 192.168.0.11
discovery.seed_hosts: ["192.168.0.3", "192.168.0.10", "192.168.0.11"]
cluster.initial_master_nodes: ["192.168.0.3", "192.168.0.10", "192.168.0.11"]
All other settings are left at their default values.
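The three files above differ only in `node.name` and `network.host`, so they can be generated from a single list of addresses to rule out copy-and-paste mistakes. This is just a sketch: the function name `render_node_configs` is hypothetical, and it emits only the four settings shown above.

```python
def render_node_configs(ips):
    """Return {node_name: opensearch.yml text} for the given node IPs.

    Node names follow the node-1, node-2, ... pattern used above; every
    node gets the same seed host and initial master node lists.
    """
    hosts = ", ".join(f'"{ip}"' for ip in ips)
    configs = {}
    for i, ip in enumerate(ips, start=1):
        name = f"node-{i}"
        configs[name] = "\n".join([
            f"node.name: {name}",
            f"network.host: {ip}",
            f"discovery.seed_hosts: [{hosts}]",
            f"cluster.initial_master_nodes: [{hosts}]",
        ]) + "\n"
    return configs


# Example: print the settings for the second node.
# print(render_node_configs(
#     ["192.168.0.3", "192.168.0.10", "192.168.0.11"])["node-2"])
```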
Hi Team,
We installed OpenSearch on 3 nodes. Initially we could get the cluster health details on all 3 nodes, as shown below.
[elastic@es2 opensearch]$ curl -k -u admin:admin -XGET https://es2:9200/_cluster/health?pretty
{
  "cluster_name" : "opensearch-elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "discovered_master" : true,
  "active_primary_shards" : 1,
  "active_shards" : 3,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
[elastic@es2 opensearch]$ date
Fri Jul 16 17:11:09 IST 2021
[elastic@es opensearch]$
After restarting the es1 host, we got the error below.
[elastic@es1 opensearch]$ curl -k -u admin:admin -XGET https://es1:9200/_cluster/health?pretty
{
  "error" : {
    "root_cause" : [
      {
        "type" : "security_exception",
        "reason" : "Unexpected exception cluster:monitor/health"
      }
    ],
    "type" : "security_exception",
    "reason" : "Unexpected exception cluster:monitor/health"
  },
  "status" : 500
}
[elastic@es1 opensearch]$ date
Fri Jul 16 17:13:26 IST 2021
[elastic@es1 opensearch]$
The logs show the following error:
[2021-07-16T17:13:17,436][ERROR][o.o.s.f.SecurityFilter ] [es1] Unexpected exception java.lang.ExceptionInInitializerError
java.lang.ExceptionInInitializerError: null
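The log line above shows only the top-level java.lang.ExceptionInInitializerError. In Java, this error wraps whatever was thrown inside a static initializer, so the underlying problem, if the full stack trace was logged, would normally appear on a later "Caused by:" line. Below is a minimal sketch for pulling those lines out of a node's log; the function name `root_causes` is hypothetical, and whether this particular log contains such lines is an assumption.

```python
import re

# Java stack traces introduce each nested cause with "Caused by: ".
CAUSED_BY = re.compile(r"^\s*Caused by: (?P<cause>.+)$")


def root_causes(log_lines):
    """Return the text of every 'Caused by:' line, in order of appearance."""
    causes = []
    for line in log_lines:
        match = CAUSED_BY.match(line)
        if match:
            causes.append(match.group("cause").strip())
    return causes


# Example usage (log_path is a placeholder for the node's log file):
# with open(log_path) as f:
#     for cause in root_causes(f):
#         print(cause)
```

The last entry in the returned list is the innermost cause, which is usually the most informative one to include in a bug report.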
If the es2 host is restarted, we see a similar issue; the same goes for the es3 host.
If all 3 hosts are restarted, we can get the cluster health on only 1 node (we can't say which node in particular). We cannot get the cluster health on all 3 nodes again unless we remove everything on all 3 nodes and do a fresh installation.
We installed with the following opensearch.yml settings:
cluster.name: opensearch-elasticsearch
node.name: ${HOSTNAME}
path.data: /var/lib/scylla/elastic/opensearch/data
path.logs: /var/lib/scylla/elastic/opensearch/logs
network.host: x.x.x.x
http.port: 9200
transport.tcp.port: 9300
discovery.seed_hosts: ["x.x.x.x:9300","x.x.x.x:9300","x.x.x.x:9300"]
cluster.initial_master_nodes: ["x.x.x.x","x.x.x.x","x.x.x.x"]
The other 2 nodes use the same config, except for network.host.
Please help us resolve this issue.
Thank you.