jazzband / django-dbbackup

Management commands to help backup and restore your project database and media files
BSD 3-Clause "New" or "Revised" License

pg_restore: error: error returned by PQputCopyData: SSL connection has been closed unexpectedly #396

Open matthieudesprez opened 3 years ago

matthieudesprez commented 3 years ago

Bug Report

Describe the bug

The dbrestore command fails on a remote server (EC2), while dbbackup runs fine on the same server.

Locally, both dbbackup and dbrestore work successfully, and I can't find why the behavior differs between the two environments.

Screenshots or reproduction

root@cb84c9d64783:/app# python manage.py dbrestore
Finding latest backup
INFO:dbbackup.command:Finding latest backup
Restoring backup for database 'default' and server 'None'
INFO:dbbackup.command:Restoring backup for database 'default' and server 'None'
Restoring: default-cb84c9d64783-2021-08-06-184602.psql.bin
INFO:dbbackup.command:Restoring: default-cb84c9d64783-2021-08-06-184602.psql.bin
Restore tempfile created: 127.0 MiB
INFO:dbbackup.command:Restore tempfile created: 127.0 MiB
Are you sure you want to continue? [Y/n]
Traceback (most recent call last):
  File "manage.py", line 21, in <module>
    main()
  File "manage.py", line 17, in main
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/__init__.py", line 401, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.8/site-packages/django/core/management/__init__.py", line 395, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 330, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 371, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python3.8/site-packages/dbbackup/management/commands/dbrestore.py", line 53, in handle
    self._restore_backup()
  File "/usr/local/lib/python3.8/site-packages/dbbackup/management/commands/dbrestore.py", line 94, in _restore_backup
    self.connector.restore_dump(input_file)
  File "/usr/local/lib/python3.8/site-packages/dbbackup/db/base.py", line 92, in restore_dump
    result = self._restore_dump(dump)
  File "/usr/local/lib/python3.8/site-packages/dbbackup/db/postgresql.py", line 125, in _restore_dump
    stdout, stderr = self.run_command(cmd, stdin=dump, env=self.restore_env)
  File "/usr/local/lib/python3.8/site-packages/dbbackup/db/postgresql.py", line 21, in run_command
    return super(PgDumpConnector, self).run_command(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/dbbackup/db/base.py", line 150, in run_command
    raise exceptions.CommandConnectorError(
dbbackup.db.exceptions.CommandConnectorError: Error running:  pg_restore --dbname=<x> --host=x-rds-proxy-x.eu-central-1.rds.amazonaws.com --port=5432 --user=<x> --no-password --single-transaction --clean
pg_restore: error: error returned by PQputCopyData: SSL connection has been closed unexpectedly

Versions

Django-dbbackup

External tools

Any help appreciated

matthieudesprez commented 3 years ago

After further analysis with pg_restore in --verbose mode, this issue seems to be triggered by large tables of 1M+ rows. I suspect the remote server does not have enough memory to handle the large COPY requests sent from the dump.
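For anyone repeating this diagnosis, here is a minimal sketch of pulling the last table name out of captured pg_restore --verbose output. The regular expression assumes progress lines of the form `processing data for table "schema.table"`, which may vary between PostgreSQL versions. The sample log and its table names are hypothetical, and `last_table_in_progress` is an illustrative helper, not part of django-dbbackup.

```python
import re
from typing import Optional

# Assumed shape of a verbose progress line (may differ between versions):
#   pg_restore: processing data for table "public.big_table"
_TABLE_LINE = re.compile(r'processing data for table "([^"]+)"')


def last_table_in_progress(verbose_stderr: str) -> Optional[str]:
    """Return the last table reported before the output ended, if any."""
    matches = _TABLE_LINE.findall(verbose_stderr)
    return matches[-1] if matches else None


# Hypothetical captured stderr; only the final error line comes from this report.
sample = """pg_restore: connecting to database for restore
pg_restore: processing data for table "public.auth_user"
pg_restore: processing data for table "public.events"
pg_restore: error: error returned by PQputCopyData: SSL connection has been closed unexpectedly
"""

print(last_table_in_progress(sample))
```

The last table reported before the error is the likely culprit, which is how a large table can be singled out.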

Running pg_dump directly, I found a workaround using --exclude-table-data and --inserts. The latter replaces COPY with INSERT statements, which is much slower but at least it works.
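As a rough sketch of that manual approach, the pg_dump arguments could be assembled as below. The database name `mydb` and the helper `build_pg_dump_args` are hypothetical. The flags are the ones mentioned in this thread, and --rows-per-insert needs pg_dump 12 or newer.

```python
import shlex
from typing import Iterable, List


def build_pg_dump_args(
    dbname: str,
    exclude_data_from: Iterable[str],
    rows_per_insert: int = 10000,
) -> List[str]:
    """Build a pg_dump argument list that emits INSERTs instead of COPY."""
    args = [
        "pg_dump",
        dbname,
        "--format=custom",
        "--inserts",  # dump rows as INSERT statements rather than COPY
        f"--rows-per-insert={rows_per_insert}",  # requires pg_dump 12+
    ]
    # Keep the table definitions but skip the rows of these tables.
    args += [f"--exclude-table-data={table}" for table in exclude_data_from]
    return args


args = build_pg_dump_args("mydb", ["django_session"])
print(shlex.join(args))  # shell-quoted form, e.g. for logging
```

Passing the list to `subprocess.run(args, ...)` avoids shell quoting problems; connection options such as --host and --port are left out here for brevity.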

By the way, I've seen that the --exclude-table-data argument was merged into master not long ago, but the latest release predates it. Is there any plan to make it available in a release?

matthieudesprez commented 3 years ago

If someone encounters the same issue: I was able to keep using dbbackup, without running pg_dump manually, by defining my own connector that overrides the pg_dump and pg_restore command arguments.

from dbbackup.db.postgresql import PgDumpConnector

class PgDumpBinaryConnector(PgDumpConnector):
    extension = "psql.bin"
    dump_cmd = "pg_dump"
    restore_cmd = "pg_restore"

    def _create_dump(self):
        cmd = "{} {}".format(self.dump_cmd, self.settings["NAME"])
        if self.settings.get("HOST"):
            cmd += " --host={}".format(self.settings["HOST"])
        if self.settings.get("PORT"):
            cmd += " --port={}".format(self.settings["PORT"])
        if self.settings.get("USER"):
            cmd += " --user={}".format(self.settings["USER"])
        cmd += " --no-password"
        cmd += " --format=custom"

        exclude_tables = (
            "django_celery_beat_clockedschedule",
            "django_celery_beat_crontabschedule",
            "django_celery_beat_intervalschedule",
            "django_celery_beat_periodictask",
            "django_celery_beat_periodictasks",
            "django_celery_beat_solarschedule",
            "django_celery_results_chordcounter",
            "django_celery_results_taskresult",
            "django_admin_log",
            "django_content_type",
            "django_migrations",
            "django_session",
        )

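        # Dump the schema of these tables but skip their rows.
        for table in exclude_tables:
            cmd += f" --exclude-table-data={table}"

        # Emit INSERT statements instead of COPY (slower, but avoids the
        # large COPY requests); --rows-per-insert needs pg_dump 12+.
        cmd += " --inserts --rows-per-insert=10000"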

        cmd = "{} {} {}".format(self.dump_prefix, cmd, self.dump_suffix)
        stdout, stderr = self.run_command(cmd, env=self.dump_env)
        return stdout

    def _restore_dump(self, dump):
        cmd = "{} --dbname={}".format(self.restore_cmd, self.settings["NAME"])
        if self.settings.get("HOST"):
            cmd += " --host={}".format(self.settings["HOST"])
        if self.settings.get("PORT"):
            cmd += " --port={}".format(self.settings["PORT"])
        if self.settings.get("USER"):
            cmd += " --user={}".format(self.settings["USER"])
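        cmd += " --no-password"
        cmd += " --single-transaction"
        # Restore rows only; the schema must already exist in the target
        # database (note that --clean is not passed here).
        cmd += " --data-only"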
        cmd = "{} {} {}".format(self.restore_prefix, cmd, self.restore_suffix)
        stdout, stderr = self.run_command(cmd, stdin=dump, env=self.restore_env)
        return stdout, stderr

and referencing it in the Django settings through:

DBBACKUP_CONNECTOR_MAPPING = {
    "app_pg_db_wrapper": "path_to_custom.PgDumpBinaryConnector",
}
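For context, a sketch of how this mapping presumably ties into the rest of the settings. It assumes django-dbbackup selects the connector by the database's ENGINE value, and that "app_pg_db_wrapper" is the custom engine used by this project. The database name is hypothetical.

```python
# settings.py (sketch, not the complete configuration)
DATABASES = {
    "default": {
        "ENGINE": "app_pg_db_wrapper",  # assumed to match the mapping key below
        "NAME": "mydb",  # hypothetical database name
    }
}

DBBACKUP_CONNECTOR_MAPPING = {
    "app_pg_db_wrapper": "path_to_custom.PgDumpBinaryConnector",
}

# Under the assumption above, the connector for "default" is found like this:
connector_path = DBBACKUP_CONNECTOR_MAPPING[DATABASES["default"]["ENGINE"]]
print(connector_path)
```

If the project uses a stock engine such as "django.db.backends.postgresql", that string would be the key instead.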