GOCDB / gocdb-failover-scripts

Scripts and doc for hosting a failover
1 stars 2 forks source link

Close DB connections before trying to restore dump #6

Open gregcorbett opened 3 years ago

gregcorbett commented 3 years ago

If the failover database happens to be in use when 1_runDbUpdate.sh is run to update it, the update fails.

This is because the existing GOCDB5 user can't be dropped, see below.

Dropping GOCDB5 user failed [0]
SQL*Plus: Release 11.2.0.2.0 Production on Fri Jul 16 08:40:01 2021 Copyright (c) 1982, 2011, Oracle. All rights reserved. SQL> Connected. SQL> Directory created. SQL> DROP USER GOCDB5 CASCADE 1_runDbUpdate.sh check_db_dump_recent.py dropGocdbUser2.sh dropGocdbUser.sh failover_TEMPLATE.sh gatherStats.sh getDump.sh lastImportedDmpFile loadData.sh ora11gEnvVars.sh pass_file pass_file_exemplar.txt updateLog.txt ERROR at line 1: ORA-01940: cannot drop a user that is currently connected SQL> Disconnected from Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit Production

One solution would be to switch the webserver off (and back on) as part of 1_runDbUpdate.sh

ghost commented 3 years ago

The issue would probably be different in the case of MariaDB where database and users are different 'things' and in the mysqldump used for import, creating the DB and dropping tables before creating them again use 'IF EXISTS', and tables are locked before importing data. Even though it seems sensible to stop the web server before an import, why is the import running at all when the failover is engaged? This implies a second instance (the non-failover) is operating in write mode?

gregcorbett commented 3 years ago

There is a difference between the failover being "engaged" and the failover webserver happens to be in use at the time.

If the failover is engaged (i.e it is the primary instance serving production requests) then the import shouldn't be running - as worst case there is no database dump to pull and the scripts may fail ungracefully and drop the failover's database with nothing to restore.

The failover's webserver is always running currently however, and if a web request comes along at the right time and talks to the database just as the failover scripts wants to drop the existing data you get the above error.

ghost commented 3 years ago

If the incoming web request is 'legitimate', then maybe a better option would be to retry the import (some number of times) before failing? That way the legitimate user is less likely to be faced with a connection rejected.

gregcorbett commented 3 years ago

Retrying n times before resorting to shutting of the webserver seems like a good idea.