matthieudesprez opened 3 years ago
After further analysis running pg_restore in --verbose mode, this issue seems to be caused by processing large tables (1M+ rows). I suspect the remote server does not have enough memory to handle the large COPY requests coming from the dump.
Running pg_dump directly, I managed to work around this thanks to --exclude-table-data and --inserts, which replaces COPY with INSERT statements (much slower, but at least it works).
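For reference, the manual pg_dump call I ended up running looked roughly like this (host, user and database names are placeholders for my setup, the excluded tables are just examples, and --rows-per-insert needs pg_dump 12+):

pg_dump mydb --host=db.example.com --username=myuser --no-password \
    --format=custom --file=backup.psql.bin \
    --exclude-table-data=django_session --exclude-table-data=django_admin_log \
    --inserts --rows-per-insert=10000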
By the way, I've seen that the --exclude-table-data argument was merged on master not long ago, but the latest release predates it. Is there any plan to make it available in a release?
If someone encounters the same issue: I was able to keep using dbbackup, without running pg_dump manually, by defining my own connector and overriding the pg_dump command arguments.
from dbbackup.db.postgresql import PgDumpConnector


class PgDumpBinaryConnector(PgDumpConnector):
    extension = "psql.bin"
    dump_cmd = "pg_dump"
    restore_cmd = "pg_restore"

    def _create_dump(self):
        cmd = "{} {}".format(self.dump_cmd, self.settings["NAME"])
        if self.settings.get("HOST"):
            cmd += " --host={}".format(self.settings["HOST"])
        if self.settings.get("PORT"):
            cmd += " --port={}".format(self.settings["PORT"])
        if self.settings.get("USER"):
            cmd += " --user={}".format(self.settings["USER"])
        cmd += " --no-password"
        # Custom (binary) format so the dump can be restored with pg_restore.
        cmd += " --format=custom"
        # Tables whose data is disposable or can be rebuilt: skip their rows
        # to keep the dump (and the restore) small.
        exclude_tables = (
            "django_celery_beat_clockedschedule",
            "django_celery_beat_crontabschedule",
            "django_celery_beat_intervalschedule",
            "django_celery_beat_periodictask",
            "django_celery_beat_periodictasks",
            "django_celery_beat_solarschedule",
            "django_celery_results_chordcounter",
            "django_celery_results_taskresult",
            "django_admin_log",
            "django_content_type",
            "django_migrations",
            "django_session",
        )
        for table in exclude_tables:
            cmd += f" --exclude-table-data={table}"
        # Dump rows as batched INSERT statements instead of COPY.
        cmd += " --inserts --rows-per-insert=10000"
        cmd = "{} {} {}".format(self.dump_prefix, cmd, self.dump_suffix)
        stdout, stderr = self.run_command(cmd, env=self.dump_env)
        return stdout

    def _restore_dump(self, dump):
        cmd = "{} --dbname={}".format(self.restore_cmd, self.settings["NAME"])
        if self.settings.get("HOST"):
            cmd += " --host={}".format(self.settings["HOST"])
        if self.settings.get("PORT"):
            cmd += " --port={}".format(self.settings["PORT"])
        if self.settings.get("USER"):
            cmd += " --user={}".format(self.settings["USER"])
        cmd += " --no-password"
        # Restore the data only, inside a single transaction (schema is assumed
        # to already exist, e.g. created by the Django migrations).
        cmd += " --single-transaction"
        cmd += " --data-only"
        cmd = "{} {} {}".format(self.restore_prefix, cmd, self.restore_suffix)
        stdout, stderr = self.run_command(cmd, stdin=dump, env=self.restore_env)
        return stdout, stderr
and referencing it through:
DBBACKUP_CONNECTOR_MAPPING = {
    "app_pg_db_wrapper": "path_to_custom.PgDumpBinaryConnector",
}
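For context, as far as I understand the connector lookup, the mapping key is matched against the database ENGINE string, so the corresponding DATABASES entry looks roughly like this (the engine wrapper name, credentials and module path are placeholders for my own setup):

# settings.py (sketch; "app_pg_db_wrapper" is a placeholder for my custom engine wrapper)
DATABASES = {
    "default": {
        "ENGINE": "app_pg_db_wrapper",  # must match the DBBACKUP_CONNECTOR_MAPPING key above
        "NAME": "mydb",
        "USER": "myuser",
        "HOST": "db.example.com",
        "PORT": "5432",
    }
}

With that in place, the regular python manage.py dbbackup and python manage.py dbrestore commands pick up the custom connector.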
Bug Report
Describe the bug
I'm experiencing issues running the dbrestore command on a remote server (EC2), while dbbackup runs fine.
Locally, both dbbackup and dbrestore work successfully, and I can't figure out why the behaviour differs in the remote environment.
Screenshots or reproduction
Versions
Django-dbbackup
External tools
Any help appreciated