seriohub / velero-helm

Helm charts for velero-ui, velero-api, and velero-watchdog

Velero-watchdog helm: add prefix text to messages sent by email, Telegram, Slack. #10

Closed rchekhina closed 4 months ago

rchekhina commented 4 months ago

Hi Davide,

In the Helm values, can you add a field for a prefix text on the messages sent by email, Telegram, and Slack?

to change this message :

Velero Backup sa-portainer-20240510093041 Completed

to:

Production Velero Backup sa-portainer-20240510093041 Completed

or:

Staging Velero Backup sa-portainer-20240510093041 Completed

One other thing: is it possible to send a message when a backup starts?

Best regards.

davideserio commented 4 months ago

Hi,

We have released a version v0.1.7 with some message improvements.

For now, as a workaround for receiving notifications when a backup starts, you can set notificationSkipInProgress to "false". The watchdog will send you a notification when it detects a new backup in progress. (workaround limit: if the backup duration is less than PROCESS_CYCLE_SEC, you may only receive messages for completed backups).
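The limit of this workaround can be illustrated with a small Python sketch. It is a hypothetical model, not the watchdog's code: it assumes the watchdog simply polls the backup state every PROCESS_CYCLE_SEC seconds, and the function and parameter names are made up for illustration.

```python
def observed_states(start, duration, cycle_sec, first_poll=0.0):
    """Return the backup states a periodic poller would see, in order.

    Hypothetical model: the backup is "InProgress" during
    [start, start + duration) and "Completed" afterwards. Polls happen at
    first_poll, first_poll + cycle_sec, and so on.
    """
    if cycle_sec <= 0:
        raise ValueError("cycle_sec must be positive")
    end = start + duration
    seen = []
    t = first_poll
    while True:
        if t >= end:
            state = "Completed"
        elif t >= start:
            state = "InProgress"
        else:
            state = None  # backup has not started yet
        if state is not None and state not in seen:
            seen.append(state)
        if state == "Completed":
            return seen
        t += cycle_sec


# A 60-second backup with a 300-second cycle is only seen once it is done.
print(observed_states(start=10, duration=60, cycle_sec=300))
# A 600-second backup is caught while still running.
print(observed_states(start=10, duration=600, cycle_sec=300))
```

Under this model, a backup that starts and finishes between two polls never produces an in-progress notification, which matches the limit described above.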

rchekhina commented 4 months ago

Hi Davide,

Ok thanks for the information.

Best regards.

rchekhina commented 4 months ago

Hi Davide,

I have updated velero-ui to version v0.1.7, but now the Slack notifications don't work. When I use "Send test notification" in the UI, I get this log in the watchdog container:

INFO:     2024-05-14 10:25:09.035 [common.routers.health] send test channel notification email:False telegram:False slack:True 
INFO:     2024-05-14 10:25:09.035 [core.dispatcher] dispatcher run active
INFO    [Env] load_key.key=TELEGRAM_ENABLE value=False
INFO    [Env] load_key.key=EMAIL_ENABLE value=False
INFO    [Env] load_key.key=SLACK_ENABLE value=True
INFO:     2024-05-14 10:25:09.035 [core.dispatcher_slack] slack channel notification is active
INFO:     2024-05-14 10:25:09.035 [core.dispatcher_slack] send_slack
INFO:     2024-05-14 10:25:09.175 [core.dispatcher_slack] Message sent successfully to Slack channel infrastructure-web
INFO:     10.2.0.7:45050 - "GET /send-test-notification?email=False&telegram=False&slack=True HTTP/1.1" 200 OK

But nothing arrives in my Slack channel.

Best regards.

davideserio commented 4 months ago

In the diagnostic information on the login page, is "Check watchdog" marked with a green check?

rchekhina commented 4 months ago

Hi Davide,

Yes, it's green:

image

Best regards.

davideserio commented 4 months ago

Hi,

Strange, no changes have been made to Slack. In the ConfigMap VELERO-API-CONFIG there is a key VELERO_WATCHDOG_URL with the default value vui-watchdog-clusterip. Can you verify this key? Its value must be the name of the service connected to the watchdog pod. Next, from a bash shell in the VUI-API pod, can you try running

curl http://vui-watchdog-clusterip:8001/ 

and see if the alive message returns?

rchekhina commented 4 months ago

Hi Davide,

The value of the key VELERO_WATCHDOG_URL is velero-ui-vui-watchdog-clusterip, and this is the curl result from the API pod:

curl http://velero-ui-vui-watchdog-clusterip:8001/ 
{"status":"alive"}

Did you try the Slack notifications?

Best regards.
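The same reachability check can be scripted. The following is a minimal Python sketch that assumes the watchdog answers `GET /` with the JSON body shown above; the function names are hypothetical, and the URL must be the service name from your own VELERO_WATCHDOG_URL.

```python
import json
import urllib.request


def is_alive(body):
    """Return True if the response body is JSON of the form {"status": "alive"}."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "alive"


def check_watchdog(url, timeout=5.0):
    """Fetch the watchdog root URL and report whether it looks alive."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_alive(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # Covers connection errors, HTTP errors, bad URLs and decode errors.
        return False


print(is_alive('{"status":"alive"}'))
```

A passing check only shows that the API pod can reach the watchdog service; it says nothing about whether a notification channel such as Slack accepts the configured token.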

davideserio commented 4 months ago

Yes, on the clusters I have all 3 notification channels active and they are functioning.

Versions of my running components:

Have you tried restarting the pods and rechecking the slack channel and slack token from configmap or UI?

rchekhina commented 4 months ago

Hi Davide,

I have the same version. The token was good, but I created another one to test it and got the same result. Logs of the watchdog:

INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO:     2024-05-15 08:18:35.228 [main] start
INFO:     2024-05-15 08:18:35.228 [main] load config
INFO:     2024-05-15 08:18:35.228 [main] run server at url:0.0.0.0-port=8001
INFO:     Will watch for changes in these directories: ['/app']
WARNING:  "workers" flag is ignored when reloading is enabled.
INFO:     Uvicorn running on http://0.0.0.0:8001 (Press CTRL+C to quit)
INFO:     Started reloader process [1] using StatReload
INFO:     Started server process [9]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO:     2024-05-15 08:18:35.556 [main] start
INFO:     2024-05-15 08:18:35.556 [main] load config
INFO    [Env] load_key.key=PROCESS_CYCLE_SEC value=300
INFO    [Env] load_key.key=NOTIFICATION_ALIVE_MSG_HOURS value=24
INFO    [Env] load_key.key=TELEGRAM_ENABLE value=False
INFO    [Env] load_key.key=TELEGRAM_CHAT_ID value=<telegram*********
INFO    [Env] load_key.key=TELEGRAM_TOKEN value=<telegra********
INFO    [Env] load_key.key=TELEGRAM_MAX_MSG_LEN value=3000
INFO    [Env] load_key.key=TELEGRAM_MAX_MSG_MINUTE value=20
INFO    [Env] load_key.key=EMAIL_ENABLE value=False
INFO    [Env] load_key.key=EMAIL_ACCOUNT value=<your-email>
INFO    [Env] load_key.key=EMAIL_PASSWORD value=<your-p*******
INFO    [Env] load_key.key=EMAIL_SMTP_PORT value=<smtp-port>
E=invalid literal for int() with base 10: '<smtp-port>', F=/app/config/config.py, L=139
INFO    [Env] load_key.key=EMAIL_SMTP_SERVER value=<smtp-server>
INFO    [Env] load_key.key=EMAIL_RECIPIENTS value=<email-recipents-comma-saparted>
INFO    [Env] load_key.key=SLACK_ENABLE value=True
INFO    [Env] load_key.key=SLACK_CHANNEL value=#infrastr*********
INFO    [Env] load_key.key=SLACK_TOKEN value=xapp-1-A06K6K11NJU-7112993529334-68abafe086fe6e2************************************************
INFO    [Dispatcher setup] telegram=False
INFO    [Dispatcher setup] email=False
INFO    [Dispatcher setup] Slack=True
INFO    [Env] load_key.key=BACKUP_ENABLE value=True
INFO    [Env] load_key.key=SCHEDULE_ENABLE value=True
INFO    [Env] load_key.key=EXPIRES_DAYS_WARNING value=29
INFO    [Env] load_key.key=PROCESS_CLUSTER_NAME value=cluster.local
INFO    [Env] load_key.key=K8S_INCLUSTER_MODE value=True
INFO    [Env] load_key.key=PROCESS_KUBE_CONFIG value=None
INFO    [Env] load_key.key=IGNORE_NM_1 value=None
INFO    [Process setup] k8s cluster name= cluster.local
INFO    [Process setup] k8s in cluster mode=True
INFO    [Process setup] k8s config file=None
INFO    [Process setup] velero backup enable=True
INFO    [Process setup] velero schedule enable=True
INFO    [Process setup] k8s send summary message=True
INFO    [Process setup] k8s ignored namespaces: regex defined 0
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO:     2024-05-15 08:18:38.026 [core.kubernetes_status_run] start main procedure seconds 300
INFO:     2024-05-15 08:18:38.026 [core.velero_checker] checker run
INFO:     2024-05-15 08:18:38.026 [core.dispatcher] dispatcher run active
INFO:     2024-05-15 08:18:38.026 [core.dispatcher_telegram] telegram channel notification is active
INFO:     2024-05-15 08:18:38.026 [core.dispatcher_email] email channel notification is active
INFO:     2024-05-15 08:18:38.026 [core.dispatcher_slack] slack channel notification is active
INFO:     2024-05-15 08:18:40.026 [core.kubernetes_status_run] self.cycle_seconds 300
INFO:     2024-05-15 08:18:40.133 [core.velero_status] _get_namespace_list all: 83 after filter: 83 ignored : 0
INFO:     2024-05-15 08:18:40.142 [core.velero_checker] checker new element received
INFO:     2024-05-15 08:18:40.142 [core.velero_checker] __process_cluster_name__
INFO:     2024-05-15 08:18:40.142 [core.velero_checker] cluster name cluster.local
INFO:     2024-05-15 08:18:40.142 [core.velero_checker] __process_schedule_report
INFO:     2024-05-15 08:18:40.142 [core.velero_checker] __process_schedule_report. do nothing same data
INFO:     2024-05-15 08:18:40.142 [core.velero_checker] __last_backup_report
INFO:     2024-05-15 08:18:40.142 [core.velero_checker] __last_backup_report. unscheduled namespaces status changed
INFO:     2024-05-15 08:18:40.142 [core.velero_checker] __last_backup_report. unscheduled status changed. no old value set
INFO:     2024-05-15 08:18:40.142 [core.velero_checker] send_to_dispatcher. msg len= 2-unique False 
INFO:     2024-05-15 08:18:40.142 [core.dispatcher_slack] send_slack
INFO:     2024-05-15 08:18:40.332 [core.dispatcher_slack] Message sent successfully to Slack channel #infrastructure-web
INFO:     10.2.0.252:51524 - "GET /info HTTP/1.1" 200 OK
INFO:     10.2.0.252:51538 - "GET /info HTTP/1.1" 200 OK
INFO:     10.2.0.252:51554 - "GET / HTTP/1.1" 200 OK
INFO:     10.2.0.252:51560 - "GET /info HTTP/1.1" 200 OK
INFO:     10.2.0.252:41276 - "GET /info HTTP/1.1" 200 OK
INFO:     10.2.0.252:41292 - "GET /info HTTP/1.1" 200 OK
INFO    [Env] load_key.key=PROCESS_CYCLE_SEC value=300
INFO    [Env] load_key.key=NOTIFICATION_ALIVE_MSG_HOURS value=24
INFO    [Env] load_key.key=TELEGRAM_ENABLE value=False
INFO    [Env] load_key.key=TELEGRAM_CHAT_ID value=<telegram*********
INFO    [Env] load_key.key=TELEGRAM_TOKEN value=<telegra********
INFO    [Env] load_key.key=TELEGRAM_MAX_MSG_LEN value=3000
INFO    [Env] load_key.key=TELEGRAM_MAX_MSG_MINUTE value=20
INFO    [Env] load_key.key=EMAIL_ENABLE value=False
INFO    [Env] load_key.key=EMAIL_ACCOUNT value=<your-email>
INFO    [Env] load_key.key=EMAIL_PASSWORD value=<your-p*******
INFO    [Env] load_key.key=EMAIL_SMTP_PORT value=<smtp-port>
E=invalid literal for int() with base 10: '<smtp-port>', F=/app/config/config.py, L=139
INFO    [Env] load_key.key=EMAIL_SMTP_SERVER value=<smtp-server>
INFO    [Env] load_key.key=EMAIL_RECIPIENTS value=<email-recipents-comma-saparted>
INFO    [Env] load_key.key=SLACK_ENABLE value=True
INFO    [Env] load_key.key=SLACK_CHANNEL value=#infrastr*********
INFO    [Env] load_key.key=SLACK_TOKEN value=xapp-1-A06K6K11NJU-7112993529334-68abafe086fe6e2************************************************
INFO    [Dispatcher setup] telegram=False
INFO    [Dispatcher setup] email=False
INFO    [Dispatcher setup] Slack=True
INFO    [Env] load_key.key=BACKUP_ENABLE value=True
INFO    [Env] load_key.key=SCHEDULE_ENABLE value=True
INFO    [Env] load_key.key=EXPIRES_DAYS_WARNING value=29
INFO    [Env] load_key.key=PROCESS_CLUSTER_NAME value=cluster.local
INFO    [Env] load_key.key=K8S_INCLUSTER_MODE value=True
INFO    [Env] load_key.key=PROCESS_KUBE_CONFIG value=None
INFO    [Env] load_key.key=IGNORE_NM_1 value=None
INFO    [Process setup] k8s cluster name= cluster.local
INFO    [Process setup] k8s in cluster mode=True
INFO    [Process setup] k8s config file=None
INFO    [Process setup] velero backup enable=True
INFO    [Process setup] velero schedule enable=True
INFO    [Process setup] k8s send summary message=True
INFO    [Process setup] k8s ignored namespaces: regex defined 0
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO    [Env] load_key.key=DEBUG_LEVEL value=Info
INFO:     2024-05-15 08:19:21.193 [common.routers.health] send test channel notification email:False telegram:False slack:True 
INFO:     2024-05-15 08:19:21.193 [core.dispatcher] dispatcher run active
INFO    [Env] load_key.key=TELEGRAM_ENABLE value=False
INFO    [Env] load_key.key=EMAIL_ENABLE value=False
INFO    [Env] load_key.key=SLACK_ENABLE value=True
INFO:     2024-05-15 08:19:21.193 [core.dispatcher_slack] slack channel notification is active
INFO:     2024-05-15 08:19:21.193 [core.dispatcher_slack] send_slack
INFO:     2024-05-15 08:19:21.372 [core.dispatcher_slack] Message sent successfully to Slack channel #infrastructure-web
INFO:     10.2.0.252:39248 - "GET /send-test-notification?email=False&telegram=False&slack=True HTTP/1.1" 200 OK

Best regards.

rchekhina commented 4 months ago

Hi Davide,

My watchdog ConfigMap:

  API_ENDPOINT_URL: 0.0.0.0
  BACKUP_ENABLE: "True"
  DEBUG_LEVEL: Info
  EMAIL_ACCOUNT: <your-email>
  EMAIL_ENABLE: "False"
  EMAIL_PASSWORD: <your-password>
  EMAIL_RECIPIENTS: <email-recipents-comma-saparted>
  EMAIL_SMTP_PORT: <smtp-port>
  EMAIL_SMTP_SERVER: <smtp-server>
  EXPIRES_DAYS_WARNING: "29"
  K8S_INCLUSTER_MODE: "True"
  NOTIFICATION_SKIP_COMPLETED: "False"
  NOTIFICATION_SKIP_INPROGRESS: "False"
  NOTIFICATION_SKIP_REMOVED: "True"
  PROCESS_CLUSTER_NAME: cluster.local
  PROCESS_CYCLE_SEC: "300"
  SCHEDULE_ENABLE: "True"
  SLACK_CHANNEL: "infrastructure-web"
  SLACK_ENABLE: "True"
  SLACK_TOKEN: xapp-1-xxxxxxxx-xxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  TELEGRAM_CHAT_ID: <telegram-chat-id>
  TELEGRAM_ENABLE: "False"
  TELEGRAM_TOKEN: <telegram-token>

Best regards.

davideserio commented 4 months ago

I replicated the configuration and it is working. The only difference is that my token starts with xoxb- (Bot token) instead of xapp- (App-level token). Also, you could try running the following commands from the Watchdog pod bash:

curl "http://127.0.0.1:8001/send-test-notification?email=False&telegram=False&slack=True"

and

curl -d "text=Test-from-bash-pod" -d "channel=<your-channel-id>" -H "Authorization: Bearer <your-token>" -X POST https://slack.com/api/chat.postMessage

rchekhina commented 4 months ago

Hi Davide,

OK, I tested it with the app token and the result is: {"ok":false,"error":"not_allowed_token_type"}. So you are right, the token type was wrong. I now use the bot token and the test message works, but when I run a backup I don't receive the messages for the start and completion of the job. I have these variables in the ConfigMap:

NOTIFICATION_SKIP_COMPLETED: "false"
NOTIFICATION_SKIP_INPROGRESS: "false"
NOTIFICATION_SKIP_REMOVED: "True"

Best regards
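The not_allowed_token_type result above matches Slack's token prefixes: xoxb- marks a bot token, xoxp- a user token, and xapp- an app-level token, and chat.postMessage expects a bot or user token. A minimal Python sketch of that check follows; the function names are hypothetical and not part of the watchdog. It also reads the "ok" field of the response body, since Slack's Web API can return HTTP 200 with "ok": false, which may be why the earlier logs reported success. That explanation is an assumption, not something confirmed in this thread.

```python
import json
from typing import Optional, Tuple

# Well-known Slack token prefixes and the token type they indicate.
TOKEN_PREFIXES = {"xoxb-": "bot", "xoxp-": "user", "xapp-": "app-level"}


def slack_token_kind(token: str) -> str:
    """Classify a Slack token by its prefix."""
    for prefix, kind in TOKEN_PREFIXES.items():
        if token.startswith(prefix):
            return kind
    return "unknown"


def usable_for_post_message(token: str) -> bool:
    """chat.postMessage accepts bot and user tokens, not app-level tokens."""
    return slack_token_kind(token) in {"bot", "user"}


def slack_call_result(body: str) -> Tuple[bool, Optional[str]]:
    """Return (ok, error) from a Slack Web API JSON response body."""
    payload = json.loads(body)
    return bool(payload.get("ok")), payload.get("error")


print(slack_token_kind("xapp-1-example"), usable_for_post_message("xapp-1-example"))
print(slack_call_result('{"ok":false,"error":"not_allowed_token_type"}'))
```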

rchekhina commented 4 months ago

Hi Davide,

I only receive the report notification on Slack, and in the report the namespaces covered by the backup schedules appear in the list of namespaces without an active backup.

Best regards

davideserio commented 4 months ago

Hi,

We are fixing several issues in the watchdog. I hope to release a new version asap.

rchekhina commented 4 months ago

Ok, thanks for the update.

davideserio commented 4 months ago

Chart version 0.1.8 has been released. In addition to several improvements, the variables reportBackupItemPrefix and reportScheduleItemPrefix are now available for use as prefixes in messages.

rchekhina commented 4 months ago

Hi Davide,

OK, thanks. I have installed it, but I don't receive the start and complete notifications on Slack. The watchdog test message works, but in the summary none of the backups, schedules, or backed-up namespaces appear:

Namespaces:
    • total=83
    • unscheduled=83
Backups (based on last backup for every schedule and backup without schedule)
    • total=0
    • completed=0
Namespace without active backup (83/83):

Best regards

davideserio commented 4 months ago

Are backups and schedules displayed correctly in the UI?

rchekhina commented 4 months ago

Hi Davide,

Yes, in the UI I can see the backups and schedules:

image

Maybe the problem is that velero is installed on the namespace sa-velero-ui and velero-ui on namespace sa-velero-ui.

Best regards.

davideserio commented 4 months ago

Hi, Can you check the API or watchdog pod for any errors?

rchekhina commented 4 months ago

Hi Davide,

I don't see any errors in the API and watchdog logs:

api:

INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "GET /schedule/get HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "POST /backup/create-from-schedule HTTP/1.1" 201 Created
INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK
INFO:     10.2.2.71:40258 - "GET /stats/in-progress HTTP/1.1" 200 OK

watchdog:

INFO:     2024-05-24 06:45:34.252 [core.kubernetes_status_run] self.cycle_seconds 300
INFO:     2024-05-24 06:45:34.370 [core.velero_status] _get_namespace_list all: 83 after filter: 83 ignored : 0
INFO:     2024-05-24 06:45:34.423 [core.velero_checker] checker new element received
INFO:     2024-05-24 06:45:34.423 [core.velero_checker] __process_cluster_name__
INFO:     2024-05-24 06:45:34.423 [core.velero_checker] cluster name k8s-stg1-global-001
INFO:     2024-05-24 06:45:34.424 [core.velero_checker] __process_schedule_difference_report
INFO:     2024-05-24 06:45:34.424 [core.velero_checker] __process_schedule_difference_report. do nothing same data
INFO:     2024-05-24 06:45:34.424 [core.velero_checker] __process_backups_difference_report
INFO:     2024-05-24 06:45:34.424 [core.velero_checker] __process_backups_difference_report. do nothing same data

Best regards.

davideserio commented 4 months ago

rchekhina wrote: "Maybe the problem is that velero is installed on the namespace sa-velero-ui and velero-ui on namespace sa-velero-ui."

Okay, that might be the problem. I'll check it out.

davideserio commented 4 months ago

Hi,

Could you try the dev image to check if it solves the problem? You should add these lines to your values-override.yaml or values.yaml file:

# cron full report schedule
report:
  veleroWatchdogReport:
    image:
      tag: dev

# daemon
watchdog:
  veleroMonitoring:
    image:
      tag: dev

And this setting to the configmap velero-watchdog-config:

K8S_VELERO_NAMESPACE=sa-velero-ui (the name of the namespace where vmware-tanzu/velero is deployed)

Environment variables are read at watchdog startup. If it is running, restart it.

rchekhina commented 4 months ago

Hi Davide,

I have changed the image tag to dev and added the variable to the ConfigMap. I have restarted the pod, but I have the same issue.

Best regards.

davideserio commented 4 months ago

Hi, to replicate this issue, could you please provide (or confirm) the following information:

Please obscure any personal data in the ConfigMaps and the values-override.yaml file.

rchekhina commented 4 months ago

Hi Davide,

velero-api configmap:

  API_ENABLE_DOCUMENTATION: "1"
  API_ENDPOINT_PORT: "8001"
  API_ENDPOINT_URL: 0.0.0.0
  API_RATE_LIMITER_CUSTOM_1: Security:xxx:60:20
  API_RATE_LIMITER_L1: "60:20"
  API_TOKEN_EXPIRATION_MIN: "30"
  API_TOKEN_REFRESH_EXPIRATION_DAYS: "7"
  AWS_ACCESS_KEY_ID: XXXXXXXXXXXXXXXXXXX
  AWS_SECRET_ACCESS_KEY: XXXXXXXXXXXXX
  DEBUG_LEVEL: info
  DOWNLOAD_TMP_FOLDER: /tmp/velero-api
  K8S_IN_CLUSTER_MODE: "True"
  K8S_VELERO_NAMESPACE: sa-velero
  K8S_VELERO_UI_NAMESPACE: sa-velero-ui
  ORIGINS_1: https://velero.mydomain.com
  REPORT_CRONJOB_NAME: velero-ui-vui-report
  RESTIC_PASSWORD: static-passw0rd
  SECURITY_DISABLE_USERS_PWD_RATE: "1"
  SECURITY_PATH_DATABASE: ./data
  VELERO_CLI_DEST_PATH: /usr/local/bin
  VELERO_CLI_PATH: ./velero-client
  VELERO_CLI_PATH_CUSTOM: ./velero-client-binary
  VELERO_CLI_VERSION: v1.12.2
  VELERO_WATCHDOG_PORT: "8001"
  VELERO_WATCHDOG_URL: velero-ui-vui-watchdog-clusterip

watchdog configmap:

  API_ENDPOINT_URL: 0.0.0.0
  BACKUP_ENABLE: "True"
  DEBUG_LEVEL: Info
  EMAIL_ACCOUNT: <your-email>
  EMAIL_ENABLE: "False"
  EMAIL_PASSWORD: <your-password>
  EMAIL_RECIPIENTS: <email-recipents-comma-saparted>
  EMAIL_SMTP_PORT: <smtp-port>
  EMAIL_SMTP_SERVER: <smtp-server>
  EXPIRES_DAYS_WARNING: "29"
  K8S_INCLUSTER_MODE: "True"
  K8S_VELERO_NAMESPACE: sa-velero-ui
  NOTIFICATION_SKIP_COMPLETED: "False"
  NOTIFICATION_SKIP_DELETING: "True"
  NOTIFICATION_SKIP_INPROGRESS: "False"
  NOTIFICATION_SKIP_REMOVED: "True"
  PROCESS_CLUSTER_NAME: k8s-stg1-global-001
  PROCESS_CYCLE_SEC: "300"
  REPORT_BACKUP_ITEM_PREFIX: k8s-stg1-global-001
  REPORT_SCHEDULE_ITEM_PREFIX: k8s-stg1-global-001
  SCHEDULE_ENABLE: "True"
  SLACK_CHANNEL: '#infrastructure-web'
  SLACK_ENABLE: "True"
  SLACK_TOKEN: xoxb-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  TELEGRAM_CHAT_ID: <telegram-chat-id>
  TELEGRAM_ENABLE: "False"
  TELEGRAM_TOKEN: <telegram-token>

namespace velero: sa-velero
namespace velero-ui: sa-velero-ui

values:

#
# Global
#
global:
  veleroNamespace: sa-velero
  kubernetesClusterDomain: cluster.local

#
# API Config
#
apiConfig:
  #
  awsAccessKeyId: XXXXXXXXXXXXXXXXX
  awsSecretAccessKey: XXXXXXXXXXXXXXXXX
  origins1: 'https://velero.mydomain.com'
  # debugLevel: debug
  veleroCliVersion: v1.12.2
  # apiTokenExpirationMin: "30"
  # apiTokenRefreshExpirationDays: "7"
  storage:
    enabled: false
    storageClassName: <your-storage-class-name>

#
# You can use nodeport or ingress according to your needs
#
#
# Nodeport
#
uiNp:
  enabled: false
  ip: "10.10.0.100"  # any ip address of your cluster
  apiPort: "30001"
  uiPort: "30002"
#
# Ingress
#
uiIngress:
  enabled: true
  # ingressClassName: nginx
  host: velero.mydomain.com
  tls:
    enabled: true
  metadata:
    annotations:
      cert-manager.io/cluster-issuer: "letsencrypt-staging-v1" 
      nginx.ingress.kubernetes.io/whitelist-source-range: "xxx.xxx.xxx.xxx/32"
  spec:
    tls:
     - hosts:
       - velero.mydomain.com
       secretName: velero-ui-tls

#
# Watchdog Cron
#
report:
  schedule: 0 8 * * *
  veleroWatchdogReport:
    image:
      tag: dev

#
# Watchdog Daemon
#
watchdog:
  veleroMonitoring:
    image:
      tag: dev
watchdogConfig:
  # config
  k8SInclusterMode: "True"
  processClusterName: k8s-stg1-global-001
  # processCycleSec: 300
  # expiresDaysWarning: 29
  notificationSkipCompleted: "False"
  notificationSkipInProgress: "False"
  reportBackupItemPrefix: "k8s-stg1-global-001"
  reportScheduleItemPrefix: "k8s-stg1-global-001"

  # email
  emailEnable: "False"
  emailAccount: <your-email>
  emailPassword: <your-password>
  emailRecipients: <email-recipents-comma-saparted>
  emailSmtpPort: <smtp-port>
  emailSmtpServer: <smtp-server>

  # slack
  slackEnable: "True"
  slackChannel: "#infrastructure-web"
  slackToken: xoxb-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

  # telegram
  telegramEnable: "False"
  telegramChatId: "<telegram-chat-id>"
  telegramToken: "<telegram-token>"

Best regards.

davideserio commented 4 months ago

In the watchdog ConfigMap, you should set:

K8S_VELERO_NAMESPACE: sa-velero

instead of:

K8S_VELERO_NAMESPACE: sa-velero-ui

K8S_VELERO_NAMESPACE is the field for the name of the namespace where vmware-tanzu/velero is deployed.

Update the ConfigMap and restart the watchdog pod (still with the dev tag). Let me know if it works.
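A mismatch like this one can be caught by comparing the two ConfigMaps. The Python sketch below is only an illustration: it assumes the API and the watchdog should both point at the namespace where Velero runs, takes the ConfigMap data as plain dictionaries, and uses a hypothetical function name.

```python
def check_velero_namespace(api_cfg, watchdog_cfg, key="K8S_VELERO_NAMESPACE"):
    """Return a description of the problem, or None if both values agree."""
    api_ns = api_cfg.get(key)
    wd_ns = watchdog_cfg.get(key)
    if api_ns is None or wd_ns is None:
        return f"{key} is missing from one of the ConfigMaps"
    if api_ns != wd_ns:
        return f"{key} differs: api={api_ns!r}, watchdog={wd_ns!r}"
    return None


# Values taken from the ConfigMaps posted in this thread.
api = {"K8S_VELERO_NAMESPACE": "sa-velero"}
watchdog = {"K8S_VELERO_NAMESPACE": "sa-velero-ui"}
print(check_velero_namespace(api, watchdog))
```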

rchekhina commented 4 months ago

Hi Davide,

OK, this resolved the report issue:

Cluster: k8s-stg1-global-001
Namespaces:
    • total=83
    • unscheduled=79
Backups (based on last backup for every schedule and backup without schedule)
    • total=4
    • completed=4

But I still don't receive a Slack notification when a backup starts.

Best regards.

davideserio commented 4 months ago

Well, in the next release, we will merge the changes contained in the dev image, and you will be able to use the non-development image.

As mentioned, for now, there is only this workaround. You could lower the PROCESS_CYCLE_SEC, but I don't think it makes sense to run these checks too frequently.

We are currently developing other improvements. Later, we will consider how we can enhance the notification at backup startup.

rchekhina commented 4 months ago

Hi Davide,

Ok, thanks for the update.

Best regards.
