Open nevesing opened 6 months ago
@nevesing According to the log you provided, your OpenSearch instance answered HTTP/1.1 503, which means Service Unavailable. So how is that related to TLS or our configuration? It also shows that the TLS configuration works fine, since the client was able to retrieve a response.
So please investigate why your OpenSearch instance is in an unhealthy state; this has nothing to do with ReportPortal.
@HardNorth - I don't think HTTP 503 is relevant here, since the curl command I mentioned above works fine, which means OpenSearch is healthy. The error log also contains CERTIFICATE_VERIFY_FAILED, so it is related to TLS. Could you please help?
@nevesing After CERTIFICATE_VERIFY_FAILED there is an error description: unable to get local issuer certificate (_ssl.c:1007). Are your certificates issued by a public issuer?
@HardNorth No, the issuer is internal to our org only. Again, curl was successful with the same set of CA, cert and key. The volume mounts are also fine: I am able to get into the container and cat those files without permission issues.
Below commands were executed from inside the metrics-gatherer container:
uwsgi@eabc486bc0c5:/backend/tls$ ls -al
total 12
drwxr-xr-x 2 root root 50 Apr 17 15:04 .
drwxr-xr-x 1 root root 17 Apr 17 15:04 ..
-rw-r--r-- 1 root root 3974 Apr 16 16:29 ca.crt
-rw-r--r-- 1 root root 2115 Apr 16 16:29 tls.crt
-rw-r--r-- 1 root root 1705 Apr 16 16:29 tls.key
uwsgi@eabc486bc0c5:/backend/tls$ curl https://xxxx-external-opensearch:9200/_cluster/health --cacert ca.crt --cert tls.crt --key tls.key -u xxx
Enter host password for user 'xxx':
{"cluster_name":"PRD_Cluster","status":"green","timed_out":false,"number_of_nodes":5,"number_of_data_nodes":5,"discovered_master":true,"discovered_cluster_manager":true,"active_primary_shards":34,"active_shards":131,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}
uwsgi@eabc486bc0c5:/backend/tls$ printenv | grep ^ES_
ES_HOST=https://xxxx-external-opensearch:9200
ES_CLIENT_CERT=/backend/tls/tls.crt
ES_USE_SSL=true
ES_TURN_OFF_SSL_VERIFICATION=false
ES_VERIFY_CERTS=true
ES_CLIENT_KEY=/backend/tls/tls.key
ES_PASSWORD=xxxx
ES_SSL_SHOW_WARN=true
ES_CA_CERT=/backend/tls/ca.crt
ES_USER=xxx
Shall we reopen the issue?
I also tried adding the REQUESTS_CA_BUNDLE envvar to the list, and now the error message is different. I feel like es_client.py is not using the TLS parameters at all. Could you please test whether HTTPS actually works?
reportportal-metrics-gatherer | 2024-04-22 13:41:18,678 - ERROR - metricsGatherer.es_client - HTTPSConnectionPool(host='xxxx-external-opensearch', port=9200): Max retries exceeded with url: /_cluster/health (Caused by SSLError(SSLError(1, '[SSL: SSLV3_ALERT_BAD_CERTIFICATE] sslv3 alert bad certificate (_ssl.c:2578)')))
reportportal-metrics-gatherer | 2024-04-22 13:41:18,678 - ERROR - metricsGatherer.es_client - Elasticsearch is not healthy
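A note on the two errors, in case it helps narrow this down. CERTIFICATE_VERIFY_FAILED with "unable to get local issuer certificate" is raised on the client side when it cannot build a chain from the server certificate to a CA it trusts. SSLV3_ALERT_BAD_CERTIFICATE is a TLS alert sent back by the server, which usually points at the client certificate (rejected or not presented). As far as I know, REQUESTS_CA_BUNDLE only supplies the CA bundle and does not configure a client certificate, so the change in error would be consistent with the CA now being picked up while the client cert and key are still not being sent. That is a guess, not a confirmed diagnosis. Below is a minimal standard-library sketch to test both halves separately; the function names are hypothetical and it is not how es_client.py works.

```python
import http.client
import ssl


def build_tls_context(ca_path=None, cert_path=None, key_path=None):
    """Build a client TLS context that verifies the server certificate.

    ca_path=None falls back to the system trust store. The client
    certificate is only presented when cert_path is given.
    """
    ctx = ssl.create_default_context(cafile=ca_path)
    if cert_path:
        # Present a client certificate for mutual TLS.
        ctx.load_cert_chain(certfile=cert_path, keyfile=key_path)
    return ctx


def cluster_health_status(host, port, ctx, auth_header=None):
    """Return the HTTP status of GET /_cluster/health over the given context."""
    conn = http.client.HTTPSConnection(host, port, context=ctx, timeout=10)
    try:
        headers = {"Authorization": auth_header} if auth_header else {}
        conn.request("GET", "/_cluster/health", headers=headers)
        return conn.getresponse().status
    finally:
        conn.close()
```

Running it inside the container once with only /backend/tls/ca.crt and once with ca.crt, tls.crt and tls.key should show whether the server insists on a client certificate: if the first call fails with a handshake alert and the second returns a status code, the client cert is the missing piece.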
I also tried a simple hello py program for ES client which works fine with my cert combination:
from elasticsearch import Elasticsearch

ELASTIC_PASSWORD = "xxxxxxxx"

client = Elasticsearch(
    "https://xxxx-external-opensearch:9200",
    ca_certs="ca.crt",
    client_cert="tls.crt",
    client_key="tls.key",
    basic_auth=("xxx", ELASTIC_PASSWORD)
)
client.info()
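For comparison, here is a rough sketch of how the ES_* variables from the printenv output above could be mapped onto the same client arguments. This is an assumption about the intended behavior, not the actual logic in es_client.py: the helper names are hypothetical, and the way ES_VERIFY_CERTS and ES_TURN_OFF_SSL_VERIFICATION are combined is my guess.

```python
def _as_bool(value, default=False):
    """Interpret an environment string such as 'true' or 'false'."""
    if value is None:
        return default
    return value.strip().lower() in ("1", "true", "yes")


def es_kwargs_from_env(env):
    """Translate ES_* variables into keyword arguments for Elasticsearch().

    Hypothetical mapping: verification stays on unless it is explicitly
    turned off by either variable.
    """
    verify = _as_bool(env.get("ES_VERIFY_CERTS"), True) and not _as_bool(
        env.get("ES_TURN_OFF_SSL_VERIFICATION")
    )
    kwargs = {
        "verify_certs": verify,
        "ssl_show_warn": _as_bool(env.get("ES_SSL_SHOW_WARN")),
    }
    # Only pass file paths that are actually set.
    for env_name, kwarg in (
        ("ES_CA_CERT", "ca_certs"),
        ("ES_CLIENT_CERT", "client_cert"),
        ("ES_CLIENT_KEY", "client_key"),
    ):
        if env.get(env_name):
            kwargs[kwarg] = env[env_name]
    if env.get("ES_USER"):
        kwargs["basic_auth"] = (env["ES_USER"], env.get("ES_PASSWORD", ""))
    return kwargs
```

With os.environ passed in, the result could be handed to the client as Elasticsearch(os.environ["ES_HOST"], **es_kwargs_from_env(os.environ)). If the service builds its client without the ca_certs, client_cert and client_key equivalents, that would explain why curl and the hello program succeed while the containers fail.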
@nevesing This all looks suspicious to me. Why does your Analyzer work fine, then? Or is there something you are not telling us?
@HardNorth - analyzer, analyzer-train and metrics-gatherer: all 3 containers have the same issue
@HardNorth were you able to troubleshoot?
@HardNorth could you please help?
Describe the bug I am trying to use an external OpenSearch instance with TLS enabled. However, the envvars described in the docs are not working as expected.
Steps to Reproduce Steps to reproduce the behavior:
Enable these envvars for the container service-metrics-gatherer:5.11.0
Expected behavior Successful connection to OpenSearch
Actual behavior
Artifact version 5.11.0
Additional info
Verified the certs with the curl command and it works fine