Can you provide the YAML for the ServiceMonitor you created?
Hello @HoustonPutman, below is the ServiceMonitor YAML; I used the default provided in the Solr Operator documentation.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: solr-metrics
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      solr-prometheus-exporter: solr-dev-prom-exporter
  namespaceSelector:
    matchNames:
So you are using a ServiceMonitor, and the Solr metrics service is listening on port 80, or at least it should be... The pod is listening on port 8080, but the service forwards 80 -> 8080 when sending the request to the pod.
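For reference, a minimal sketch of what that generated metrics Service would look like; the names here are illustrative (the operator creates a service named <name>-solr-metrics, as referenced later in this thread):

apiVersion: v1
kind: Service
metadata:
  name: solr-dev-prom-exporter-solr-metrics  # hypothetical generated name
  labels:
    solr-prometheus-exporter: solr-dev-prom-exporter
spec:
  selector:
    solr-prometheus-exporter: solr-dev-prom-exporter
  ports:
  - name: solr-metrics
    port: 80          # port the Service listens on
    targetPort: 8080  # port the exporter pod actually listens on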
I have almost the exact same thing working correctly.
What version of the Prometheus stack are you running? Also, can you provide information on your Kube cluster (version, vendor, etc.)? I have a feeling there's an issue with your networking.
You are right, that's how it's supposed to work. However, the service endpoint in the Prometheus targets is referring to http://podIP:80/metrics and, because of that, the connection is getting refused. My other default service endpoints for Prometheus are working as expected.
Prometheus:
Chart: prometheus-15.16.1
Version: 2.39.1
Kubernetes:
AWS EKS, version 1.22
Are you sure you don't have a podMonitor defined as well?
Looks like there might be a bug in the Prometheus Operator? In the meantime you can use targetPort instead to set 8080. Here are the available options under endpoints.
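For example, a minimal sketch of that workaround applied to the ServiceMonitor above (reusing its labels; the metrics path is an assumption):

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: solr-metrics
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      solr-prometheus-exporter: solr-dev-prom-exporter
  endpoints:
  - targetPort: 8080  # scrape the pod port directly instead of the annotated service port
    path: /metrics    # assumed metrics path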
We have the same problem here. We are using solr-operator 0.6 and Prometheus 2.39.1, hosted on GKE version 1.21. We aren't using the Prometheus Operator. I deployed the Solr Prometheus exporter with the following snippet:
apiVersion: solr.apache.org/v1beta1
kind: SolrPrometheusExporter
metadata:
  name: solr-prom-exporter
spec:
  customKubeOptions:
    resources:
      requests:
        cpu: 300m
        memory: 900Mi
  solrReference:
    basicAuthSecret: solr-cloud-k8s-oper-secret
    cloud:
      name: "apache-solr"
  numThreads: 6
As you can see in the screenshot, Prometheus tries to connect to the pod on port 80, which is the wrong port.
Our workaround is to add Prometheus scraping annotations to the exporter pod:
spec:
  customKubeOptions:
    podOptions:
      annotations:
        prometheus.io/port: "8080"
        prometheus.io/path: /metrics
        prometheus.io/scrape: "true"
        prometheus.io/scheme: http
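This works because the default kubernetes-pods scrape job keys off exactly these pod annotations; the relevant relabeling rule (quoted verbatim from the default config posted in full later in this thread) rewrites the target address to podIP:annotationPort:

- action: replace
  regex: (\d+);((([0-9]+?)(\.|$)){4})
  replacement: $2:$1
  source_labels:
  - __meta_kubernetes_pod_annotation_prometheus_io_port
  - __meta_kubernetes_pod_ip
  target_label: __address__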
In that screenshot, is the 10.110.6.70 IP address the service ClusterIP or the pod IP? If it's the service's, then there is something wrong with Kubernetes. If it's the pod's, then Prometheus shouldn't be trying to contact the pod at all; it should be contacting the service IP...
Even after adding the pod annotations, Prometheus is still looking at port 80 on the pod IP in my case. Something is seriously wrong with this. Below is my exporter config.
apiVersion: solr.apache.org/v1beta1
kind: SolrPrometheusExporter
metadata:
  name: solr-prom-exporter
spec:
  customKubeOptions:
    podOptions:
      annotations:
        prometheus.io/port: "8080"
        prometheus.io/path: /metrics
        prometheus.io/scrape: "true"
        prometheus.io/scheme: http
      resources:
        requests:
          cpu: 300m
          memory: 900Mi
  solrReference:
    cloud:
      name: "eks"
  numThreads: 6
It is the pod IP.
The old failed target will still exist, but there should be a new target which should work.
Can you share your Prometheus scraping config? This seems to be a Prometheus issue...
We are having the same issue. The prometheus.io/port annotation is set to port 80, which doesn't correspond to the port of the pod. This causes Prometheus to fail to scrape the service endpoint.
We've also bypassed the problem by enabling scraping of the pods directly:
customKubeOptions:
  podOptions:
    annotations:
      prometheus.io/port: "8080"
      prometheus.io/path: /metrics
      prometheus.io/scrape: "true"
      prometheus.io/scheme: http
The Prometheus scraping config we use is the kubernetes-service-endpoints job from the default config.
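For context, that job contains a relabeling rule (quoted verbatim from the default config posted later in this thread) that replaces the port in the endpoint address with whatever the service's prometheus.io/port annotation says, which is how the pod IP ends up paired with port 80:

- action: replace
  regex: (.+?)(?::\d+)?;(\d+)
  replacement: $1:$2
  source_labels:
  - __address__
  - __meta_kubernetes_service_annotation_prometheus_io_port
  target_label: __address__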
Looking at the code, it looks like the prometheus.io/port value is set from ExtSolrMetricsPort, not SolrMetricsPort, which would have fixed the problem. Any attempt to override this by using custom serviceAnnotations does not work, as custom annotations can only supplement the default ones, not overwrite them: https://github.com/apache/solr-operator/blob/main/controllers/util/prometheus_exporter_util.go#L400
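In other words, the metrics Service generated by the operator ends up looking roughly like this (a sketch; the generated name is illustrative and the exact annotation set is an assumption):

apiVersion: v1
kind: Service
metadata:
  name: solr-prom-exporter-solr-metrics  # hypothetical: <name>-solr-metrics
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: /metrics
    prometheus.io/port: "80"  # set from ExtSolrMetricsPort (the service port), not SolrMetricsPort (8080)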
We have exactly the same issue.
Indeed, this is a valid workaround.
So it seems like everyone is using kubernetes-service-endpoints; could you try using kubernetes-services and see if the problem is fixed? I think the issue is that this feature was designed with the kubernetes-services usage in mind; however, it looks like it should work with kubernetes-service-endpoints as well, but breaks in this way. I don't think there's a way that we can get both to work at the same time, unless we remove the prometheus.io/port annotation altogether.
I will try to test this locally, but it might be difficult. I'm happy to create a test Docker image (based on v0.6.0) for anyone else to try out and see if it fixes things for them.
Situation before Solr: [screenshot]
Situation after Solr: [screenshot]
We installed the solr-exporter using:
apiVersion: solr.apache.org/v1beta1
kind: SolrPrometheusExporter
metadata:
  name: solr-prom-exporter
spec:
  customKubeOptions:
    podOptions:
      resources:
        requests:
          cpu: 300m
          memory: 900Mi
  solrReference:
    cloud:
      name: "eks"
  numThreads: 6
No metrics are scraped from Solr since, by default, it seems Prometheus is using the endpoints. Default Prometheus configuration:
global:
  evaluation_interval: 1m
  scrape_interval: 1m
  scrape_timeout: 10s
remote_write:
- queue_config:
    capacity: 2500
    max_samples_per_send: 1000
    max_shards: 200
  sigv4:
    region: east-us-1
  url: https://aps-workspaces.east-us-1.amazonaws.com/workspaces/XXX/api/v1/remote_write
rule_files:
- /etc/config/recording_rules.yml
- /etc/config/alerting_rules.yml
- /etc/config/rules
- /etc/config/alerts
scrape_configs:
- job_name: prometheus
  static_configs:
  - targets:
    - localhost:9090
- bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  job_name: kubernetes-apiservers
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - action: keep
    regex: default;kubernetes;https
    source_labels:
    - __meta_kubernetes_namespace
    - __meta_kubernetes_service_name
    - __meta_kubernetes_endpoint_port_name
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
- bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  job_name: kubernetes-nodes
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - replacement: kubernetes.default.svc:443
    target_label: __address__
  - regex: (.+)
    replacement: /api/v1/nodes/$1/proxy/metrics
    source_labels:
    - __meta_kubernetes_node_name
    target_label: __metrics_path__
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
- bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  job_name: kubernetes-nodes-cadvisor
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - replacement: kubernetes.default.svc:443
    target_label: __address__
  - regex: (.+)
    replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor
    source_labels:
    - __meta_kubernetes_node_name
    target_label: __metrics_path__
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
- honor_labels: true
  job_name: kubernetes-service-endpoints
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - action: keep
    regex: true
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_scrape
  - action: drop
    regex: true
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_scrape_slow
  - action: replace
    regex: (https?)
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_scheme
    target_label: __scheme__
  - action: replace
    regex: (.+)
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_path
    target_label: __metrics_path__
  - action: replace
    regex: (.+?)(?::\d+)?;(\d+)
    replacement: $1:$2
    source_labels:
    - __address__
    - __meta_kubernetes_service_annotation_prometheus_io_port
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_service_annotation_prometheus_io_param_(.+)
    replacement: __param_$1
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - action: replace
    source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - action: replace
    source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_node_name
    target_label: node
- honor_labels: true
  job_name: kubernetes-service-endpoints-slow
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - action: keep
    regex: true
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_scrape_slow
  - action: replace
    regex: (https?)
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_scheme
    target_label: __scheme__
  - action: replace
    regex: (.+)
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_path
    target_label: __metrics_path__
  - action: replace
    regex: (.+?)(?::\d+)?;(\d+)
    replacement: $1:$2
    source_labels:
    - __address__
    - __meta_kubernetes_service_annotation_prometheus_io_port
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_service_annotation_prometheus_io_param_(.+)
    replacement: __param_$1
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - action: replace
    source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - action: replace
    source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_node_name
    target_label: node
  scrape_interval: 5m
  scrape_timeout: 30s
- honor_labels: true
  job_name: prometheus-pushgateway
  kubernetes_sd_configs:
  - role: service
  relabel_configs:
  - action: keep
    regex: pushgateway
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_probe
- honor_labels: true
  job_name: kubernetes-services
  kubernetes_sd_configs:
  - role: service
  metrics_path: /probe
  params:
    module:
    - http_2xx
  relabel_configs:
  - action: keep
    regex: true
    source_labels:
    - __meta_kubernetes_service_annotation_prometheus_io_probe
  - source_labels:
    - __address__
    target_label: __param_target
  - replacement: blackbox
    target_label: __address__
  - source_labels:
    - __param_target
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
- honor_labels: true
  job_name: kubernetes-pods
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: keep
    regex: true
    source_labels:
    - __meta_kubernetes_pod_annotation_prometheus_io_scrape
  - action: drop
    regex: true
    source_labels:
    - __meta_kubernetes_pod_annotation_prometheus_io_scrape_slow
  - action: replace
    regex: (https?)
    source_labels:
    - __meta_kubernetes_pod_annotation_prometheus_io_scheme
    target_label: __scheme__
  - action: replace
    regex: (.+)
    source_labels:
    - __meta_kubernetes_pod_annotation_prometheus_io_path
    target_label: __metrics_path__
  - action: replace
    regex: (\d+);(([A-Fa-f0-9]{1,4}::?){1,7}[A-Fa-f0-9]{1,4})
    replacement: '[$2]:$1'
    source_labels:
    - __meta_kubernetes_pod_annotation_prometheus_io_port
    - __meta_kubernetes_pod_ip
    target_label: __address__
  - action: replace
    regex: (\d+);((([0-9]+?)(\.|$)){4})
    replacement: $2:$1
    source_labels:
    - __meta_kubernetes_pod_annotation_prometheus_io_port
    - __meta_kubernetes_pod_ip
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
    replacement: __param_$1
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - action: drop
    regex: Pending|Succeeded|Failed|Completed
    source_labels:
    - __meta_kubernetes_pod_phase
- honor_labels: true
  job_name: kubernetes-pods-slow
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: keep
    regex: true
    source_labels:
    - __meta_kubernetes_pod_annotation_prometheus_io_scrape_slow
  - action: replace
    regex: (https?)
    source_labels:
    - __meta_kubernetes_pod_annotation_prometheus_io_scheme
    target_label: __scheme__
  - action: replace
    regex: (.+)
    source_labels:
    - __meta_kubernetes_pod_annotation_prometheus_io_path
    target_label: __metrics_path__
  - action: replace
    regex: (\d+);(([A-Fa-f0-9]{1,4}::?){1,7}[A-Fa-f0-9]{1,4})
    replacement: '[$2]:$1'
    source_labels:
    - __meta_kubernetes_pod_annotation_prometheus_io_port
    - __meta_kubernetes_pod_ip
    target_label: __address__
  - action: replace
    regex: (\d+);((([0-9]+?)(\.|$)){4})
    replacement: $2:$1
    source_labels:
    - __meta_kubernetes_pod_annotation_prometheus_io_port
    - __meta_kubernetes_pod_ip
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
    replacement: __param_$1
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - action: drop
    regex: Pending|Succeeded|Failed|Completed
    source_labels:
    - __meta_kubernetes_pod_phase
  scrape_interval: 5m
  scrape_timeout: 30s
I have a patch that I think should work: https://github.com/apache/solr-operator/pull/539. Would someone be willing to try out this fix in their cluster?
Steps to try it:
- make docker-build, then upload to docker somewhere
- kubectl delete service <name>-solr-metrics

If it does work, we can get this into the v0.7.0 release that should be coming soon!
It seems to be working.
Cool, I will go ahead and merge then!
I have followed the Solr Operator documentation to configure the SolrPrometheusExporter; however, after creating the ServiceMonitor, the service endpoint is going inactive. After further troubleshooting, I realized Prometheus is trying to connect to port 80, whereas the metrics server is running on port 8080. Is it possible to pass the port into the ServiceMonitor?
Get "http://x.x.x.x:80/metrics": dial tcp x.x.x.x:80: connect: connection refused