nixys / nxs-backup

The tool for creating, delivering and rotating backups for GNU/Linux distributions.
https://nxs-backup.io
Apache License 2.0

SMB connection reset on timeout when external dump script runs too long (15 minutes or more) #102

Open uralm1 opened 1 week ago

uralm1 commented 1 week ago

We have the following job:

job_name: gitlab_art
type: external
dump_cmd: _ext_dump_art.sh

storages_options:
  - storage_name: smb_share
...
storage_connects:
  - name: smb_share
    smb_params:
       host: winsrv
       port: 445
 ...
       connection_timeout: 10

When the _ext_dump_art.sh script runs for a very long time, more than 15 minutes in my case (it creates a 50 GB archive), the SMB connection drops on timeout and the job fails with an error:

INFO [2024-11-10 21:36:53.345] Backup starting.
DEBUG [2024-11-10 21:36:53.345][gitlab_art] Starting rotate outdated backups.
INFO [2024-11-10 21:36:53.345][gitlab_art] Starting
DEBUG [2024-11-10 21:36:53.345][gitlab_art] Dump cmd: /srv/gitlab/backup/_ext_dump_art.sh
INFO [2024-11-10 21:36:53.345][gitlab_art] Starting of `/srv/gitlab/backup/_ext_dump_art.sh`
INFO [2024-11-10 22:00:21.903][gitlab_art] Dumping completed
DEBUG [2024-11-10 22:00:21.904][gitlab_art] STDOUT: {"full_path":"/opt/gitlab/data/backups/213653_2024_11_10_art_gitlab_backup.tar"}

DEBUG [2024-11-10 22:00:21.904][gitlab_art] Created temp backup /opt/gitlab/data/backups/213653_2024_11_10_art_gitlab_backup.tar.
ERROR [2024-11-10 22:00:21.905][gitlab_art](dras-build2) Unable to create remote directory 'gitlab_art/daily': 'mkdir gitlab_art: connection error: read tcp 192.168.1.1:35036->192.168.1.2:445: read: connection reset by peer'
ERROR [2024-11-10 22:00:21.905][gitlab_art](dras-build2) Unable to upload tmp backup
ERROR [2024-11-10 22:00:21.905][gitlab_art] Failed to create temp backup.
INFO [2024-11-10 22:00:22.645][gitlab_art] Finished
INFO [2024-11-10 22:00:22.645] Backup finished.

ERROR [2024-11-10 22:00:22.647] cmd routine fail: (details: Some of backups failed with next errors:
1 error occurred:
        * mkdir gitlab_art: connection error: read tcp 192.168.1.1:35036->192.168.1.2:445: read: connection reset by peer

)

If the external script runs quickly (the resulting file is 4-8 GB), the problem does not occur.

This problem is described in issue 68 of the go-smb2 library: https://github.com/hirochachacha/go-smb2/issues/68. Either Dial has to be called again, or long pauses after the SMB connection is established must be avoided.
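To illustrate the first option, here is a minimal sketch of a "re-dial once on failure" wrapper. It is not the nxs-backup code and does not use the go-smb2 API: the `session` interface, `dialFunc`, `reconnecting`, and `fakeSession` are hypothetical names made up for this example, and the fake simply simulates a connection that the server reset while the dump script was running. A real fix would probably also check that the error is actually a connection error before retrying.

```go
package main

import (
	"errors"
	"fmt"
)

// session is a hypothetical stand-in for an SMB session/share handle.
type session interface {
	Mkdir(path string) error
	Close() error
}

// dialFunc opens a fresh session (hypothetical; a real one would connect to the server).
type dialFunc func() (session, error)

// reconnecting keeps the current session and re-dials once when an operation fails.
type reconnecting struct {
	dial dialFunc
	cur  session
}

// do runs op on the current session. If op fails, the session is assumed
// to be stale: it is closed, a new one is dialed, and op is retried once.
func (r *reconnecting) do(op func(session) error) error {
	if r.cur == nil {
		s, err := r.dial()
		if err != nil {
			return fmt.Errorf("dial: %w", err)
		}
		r.cur = s
	}
	err := op(r.cur)
	if err == nil {
		return nil
	}
	// Drop the (possibly reset) session and try again on a fresh one.
	_ = r.cur.Close()
	r.cur = nil
	s, derr := r.dial()
	if derr != nil {
		return fmt.Errorf("redial after %v: %w", err, derr)
	}
	r.cur = s
	return op(r.cur)
}

// fakeSession simulates a connection that may have been reset by the peer.
type fakeSession struct{ stale bool }

func (f *fakeSession) Mkdir(path string) error {
	if f.stale {
		return errors.New("connection reset by peer")
	}
	return nil
}

func (f *fakeSession) Close() error { return nil }

func main() {
	dials := 0
	r := &reconnecting{dial: func() (session, error) {
		dials++
		// The first session is "stale", as if the server dropped it during a long dump.
		return &fakeSession{stale: dials == 1}, nil
	}}
	err := r.do(func(s session) error { return s.Mkdir("gitlab_art/daily") })
	fmt.Printf("dials=%d err=%v\n", dials, err) // prints "dials=2 err=<nil>"
}
```

The second option (no long pause after connecting) would instead mean dialing only after the dump command has finished, right before the upload starts.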