Closed hconnan closed 5 months ago
@hconnan Chart release v0.9.2 allows using `-notifier.blackhole` correctly. Could you check if this release helps in your case?
Hey! Thanks for the quick response!
Could you please release a new version of [victoria-metrics-k8s-stack](https://github.com/VictoriaMetrics/helm-charts/tree/master/charts/victoria-metrics-k8s-stack)? I need an updated version of it to check whether it helps in my case.
EDIT: OK, I just saw that a new version has been released for victoria-metrics-k8s-stack. Let me check.
It does not work. I got this error: `failed to init: failed to init notifier: only one of -notifier.blackhole, -notifier.url and -notifier.config flags must be specified`
I saw that you added a fix, but it seems that `vmalert.alertmanager.urls` is set somewhere and is empty by default. With your fix, I still see the `notifier.url=` parameter. I am not sure the existing condition is enough.
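The error above reads like a mutual-exclusion check over the three notifier flags. The following is a minimal Python sketch of such a check, not vmalert's actual implementation: the function name `check_notifier_flags` is hypothetical, and it assumes that a flag passed with an empty value (such as `-notifier.url=`) still counts as specified, which is consistent with the behavior reported in this comment.

```python
# Hypothetical model of the check behind the reported error.
# Assumption: any explicitly passed flag counts as "specified",
# even when its value is empty (e.g. "-notifier.url=").
EXCLUSIVE_FLAGS = ("-notifier.blackhole", "-notifier.url", "-notifier.config")


def check_notifier_flags(args):
    """Return the set of notifier flags found in `args`.

    Raises ValueError when more than one of the mutually
    exclusive notifier flags is present.
    """
    present = set()
    for arg in args:
        name, _, value = arg.partition("=")
        # An explicit "-notifier.blackhole=false" is treated as not set.
        if name == "-notifier.blackhole" and value.lower() == "false":
            continue
        if name in EXCLUSIVE_FLAGS:
            present.add(name)
    if len(present) > 1:
        raise ValueError(
            "only one of -notifier.blackhole, -notifier.url "
            "and -notifier.config flags must be specified"
        )
    return present
```

Under these assumptions, `check_notifier_flags(["-notifier.blackhole=true", "-notifier.url="])` raises, which is why an empty `notifier.url=` left in the rendered arguments is enough to trigger the failure.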
@hconnan Could you please share a values file that reproduces this error for you (with any sensitive information removed)?
Sure.
alertmanager:
  enabled: false
coreDns:
  enabled: false
defaultDashboardsEnabled: false
defaultRules:
  create: false
  rules:
    alertmanager: false
    etcd: false
    general: false
    k8s: false
    kubeApiserver: false
    kubeApiserverAvailability: false
    kubeApiserverBurnrate: false
    kubeApiserverHistogram: false
    kubeApiserverSlos: false
    kubePrometheusGeneral: false
    kubePrometheusNodeRecording: false
    kubeScheduler: false
    kubeStateMetrics: false
    kubelet: false
    kubernetesApps: false
    kubernetesResources: false
    kubernetesStorage: false
    kubernetesSystem: false
    network: false
    node: false
    vmagent: false
    vmcluster: false
    vmhealth: false
    vmsingle: false
fullnameOverride: vm-cluster
grafana:
  enabled: false
kube-state-metrics:
  enabled: true
  nameOverride: kube-state-metrics-staging
  namespaces: monitoring-staging
  rbac:
    useClusterRole: true
  replicas: 1
kubeApiServer:
  enabled: false
kubeControllerManager:
  enabled: false
kubeEtcd:
  enabled: false
kubeProxy:
  enabled: false
kubeScheduler:
  enabled: false
kubelet:
  enabled: false
prometheus-node-exporter:
  enabled: false
serviceAccount:
  annotations:
    iam.gke.io/gcp-service-account: xxx
  create: true
  name: vm-cluster
victoria-metrics-operator:
  enabled: false
vmagent:
  enabled: true
  spec:
    externalLabels:
      cluster: xxx
      entity: xxx
      environment: staging
    extraArgs:
      enableTCP6: "true"
      promscrape.suppressScrapeErrorsDelay: 120s
    ignoreNamespaceSelectors: false
    image:
      tag: v1.99.0
    inlineRelabelConfig:
    - source_labels:
      - service
      target_label: gce_instance
    nodeSelector:
      cloud.google.com/gke-nodepool: xxx
    replicaCount: 2
    resources:
      limits:
        cpu: 100m
        memory: 300Mi
      requests:
        cpu: 50m
        memory: 200Mi
    scrapeInterval: 30s
    selectAllByDefault: true
    serviceScrapeNamespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: monitoring-staging
    tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: monitoring
vmalert:
  enabled: true
  ingress:
    enabled: true
    hosts:
    - vmalert-staging.victoria-metrics
    ingressClassName: traefik
    tls:
    - secretName: tls-certs
  spec:
    extraArgs:
      notifier.blackhole: "true"
    image:
      tag: v1.99.0
    inlineRelabelConfig:
    - source_labels:
      - service
      target_label: gce_instance
    logFormat: json
    logLevel: INFO
    nodeSelector:
      cloud.google.com/gke-nodepool: xxx
    replicaCount: 2
    resources:
      limits:
        cpu: 100m
        memory: 200Mi
      requests:
        cpu: 50m
        memory: 100Mi
    selectAllByDefault: true
    tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: monitoring
vmcluster:
  enabled: true
  ingress:
    insert:
      enabled: true
      hosts:
      - vminsert-staging.victoria-metrics
      ingressClassName: traefik
      tls:
      - secretName: tls-certs
    select:
      enabled: true
      hosts:
      - vmselect-staging.victoria-metrics
      ingressClassName: traefik
      tls:
      - secretName: tls-certs
    storage:
      enabled: false
  spec:
    replicationFactor: 2
    retentionPeriod: "1"
    serviceAccountName: vm-cluster
    vminsert:
      extraArgs:
        maxLabelsPerTimeseries: "10000000"
      image:
        tag: v1.99.0-cluster
      nodeSelector:
        cloud.google.com/gke-nodepool: xxx
      replicaCount: 2
      resources:
        limits:
          cpu: "1.5"
          memory: 1000Mi
        requests:
          cpu: "1"
          memory: 500Mi
      tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: monitoring
    vmselect:
      extraArgs:
        search.maxSeries: "1000000"
        search.maxUniqueTimeseries: "0"
      image:
        tag: v1.99.0-cluster
      nodeSelector:
        cloud.google.com/gke-nodepool: xxx
      replicaCount: 2
      resources:
        limits:
          cpu: "1"
          memory: 1000Mi
        requests:
          cpu: "0.5"
          memory: 500Mi
      tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: monitoring
    vmstorage:
      containers:
      - command:
        - /bin/sh
        - -c
        - |
          sleep 40
          while true; do
            # every hour we create a snapshot and upload it to latest
            /vmbackup-prod \
              -storageDataPath=/vm-data \
              -snapshot.createURL=http://localhost:8482/snapshot/create \
              -dst=gs://xxx/vmstorage-snapshots/latest/monitoring-staging-$POD_NAME
            # if its 5am we also upload the daily snapshot
            if [ $(date +%H) -eq "05" ]; then
              /vmbackup-prod \
                -storageDataPath=/vm-data \
                -snapshot.createURL=http://localhost:8482/snapshot/create \
                -dst=gs://xxx/vmstorage-snapshots/daily-$(date +%d-%m-%Y)/monitoring-staging-$POD_NAME
            fi
            sleep 1h
          done
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        image: victoriametrics/vmbackup:v1.99.0
        name: hourly-sidecar-backup
        volumeMounts:
        - mountPath: /vm-data
          name: vmstorage-db
      extraArgs:
        search.maxUniqueTimeseries: "0"
      image:
        tag: v1.99.0-cluster
      nodeSelector:
        cloud.google.com/gke-nodepool: xxx
      replicaCount: 3
      resources:
        limits:
          cpu: "4"
          memory: 12000Mi
        requests:
          cpu: "4"
          memory: 12000Mi
      storage:
        volumeClaimTemplate:
          spec:
            resources:
              requests:
                storage: 1000Gi
      tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: monitoring
vmsingle:
  enabled: false
So `alertmanager.urls` is not set anywhere, and you can see that `notifier.blackhole` is set to true.
@hconnan The chart I referred to in this comment was actually `victoria-metrics-alert`, not `victoria-metrics-k8s-stack`.
Let me also check the `k8s-stack` chart and apply a similar fix there.
All is good for me! Great job! Thank you very much 😄 🥳
Hello,
The `notifier.url` parameter is no longer required by default in VictoriaMetrics. To disable notifications, we need to set the `notifier.blackhole` parameter to `true`. Since VictoriaMetrics 1.96, when this parameter is set, the `notifier.url` parameter cannot be set at the same time (source). However, in the server-deployment template of the victoria-metrics-alert chart, `notifier.url` is always set, regardless of the extra arguments.
How to reproduce it? Deploy vmalert with the `notifier.blackhole` extraArg set to `true`.
Fix suggestion: I suggest adding a condition so that the `notifier.url` parameter is set only when neither the `-notifier.blackhole` nor the `-notifier.config` extra argument is present.
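The suggested condition can be illustrated with a short Python sketch. This models the logic only and is not the chart's template code: the function name `build_vmalert_args` and its parameters are hypothetical, and the real fix would be written as a conditional in the Helm template.

```python
def build_vmalert_args(extra_args, notifier_url=""):
    """Build a vmalert argument list from chart-style extra arguments.

    `extra_args` maps flag names without the leading dash to values,
    e.g. {"notifier.blackhole": "true"}. Hypothetical helper used only
    to illustrate the proposed condition.
    """
    args = []
    # Proposed condition: emit -notifier.url only when neither
    # notifier.blackhole nor notifier.config is given as an extra arg.
    if "notifier.blackhole" not in extra_args and "notifier.config" not in extra_args:
        args.append(f"-notifier.url={notifier_url}")
    # Extra arguments are passed through unchanged, in a stable order.
    for key in sorted(extra_args):
        args.append(f"-{key}={extra_args[key]}")
    return args
```

With `{"notifier.blackhole": "true"}` as extra arguments, the result contains only `-notifier.blackhole=true`, so the conflicting `-notifier.url=` is no longer rendered.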