greenelab / connectivity-search-backend

Django backend for hetnet connectivity search
https://search-api.het.io
BSD 3-Clause "New" or "Revised" License
6 stars 2 forks source link

Flush of AWS database not working #16

Open dhimmel opened 5 years ago

dhimmel commented 5 years ago

I am attempting to wipe the prototype database and re-populate it using the new hetmatpy version added in #15.

However, the following command seems to run indefinitely without returning or erroring:

python manage.py flush --no-input

When the database was a local postgres instance in a Docker, this command took at most a few seconds. @dongbohu any ideas?

dongbohu commented 5 years ago

Try psql command maybe.

dhimmel commented 5 years ago

I am thinking this is because there may be existing transactions that are not closed, therefore the flush command cannot proceed (according to this StackOverflow).

The pg_stat_activity command shows we have running processes. I try to kill them with:

SELECT pg_terminate_backend(pid) FROM pg_stat_activity;

However, we get the following error:

ERROR:  must be a superuser to terminate superuser process

I couldn't make dj_hetmech a superuser:

ALTER USER dj_hetmech WITH SUPERUSER;

Turns out AWS does not allow users superuser access. I was able to get the command to work by omitting termination of rdsadmin processes.

SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE usename <> 'rdsadmin';

This booted me from my psql session, however the django flush command still stalls.

dhimmel commented 5 years ago

Trying from within a psql session:

DROP DATABASE IF EXISTS dj_hetmech;

Returned:

ERROR:  cannot drop the currently open database

Therefore, I try the dropdb shell command:

dropdb --host=$RDS_URL --user=dj_hetmech dj_hetmech

This fails with error:

dropdb: database removal failed: ERROR:  database "dj_hetmech" is being accessed by other users
DETAIL:  There are 5 other sessions using the database.
dhimmel commented 5 years ago

I tried method from https://dba.stackexchange.com/a/163668:

-- Connecting to the current user localhost's postgres instance
psql

-- Making sure the database exists
SELECT * FROM pg_database WHERE datname = 'dj_hetmech';

-- Disallow new connections
UPDATE pg_database SET datallowconn = 'false' WHERE datname = 'dj_hetmech';
ALTER DATABASE dj_hetmech CONNECTION LIMIT 1;

-- Terminate existing connections
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'dj_hetmech';

-- Drop database
DROP DATABASE dj_hetmech;

Only got to terminate existing connections, then psql disconnected from the database. Now when I try to connect in a new psql session, I receive:

psql: FATAL:  too many connections for database "dj_hetmech"

This is probably because I changed the connection limit.

@dongbohu, I may need you to take over here.

dhimmel commented 5 years ago

Okay @dongbohu rebooted the database, allowing me to login and remove the connection limit:

ALTER DATABASE dj_hetmech CONNECTION LIMIT -1;

Then I could run

-- Switch to postgres database
\connect postgres
DROP DATABASE dj_hetmech;

Then we ran the following to recreate the database:

createdb --host=$RDS_URL --user=dj_hetmech dj_hetmech
dhimmel commented 5 years ago

Notes: next time the method in https://github.com/greenelab/hetmech-backend/issues/16#issuecomment-460691145 may work, however, we must not run the command from the dj_hetmech database.

dhimmel commented 5 years ago

Success

For https://github.com/greenelab/hetmech-backend/pull/18, I used the following workflow to delete the database and recreate an empty one:

-- Switch to postgres database (or set --db-user to connect directly to it)
\connect postgres
-- Terminate existing dj_hetmech connections
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'dj_hetmech';
-- Drop dj_hetmech database
DROP DATABASE dj_hetmech;
-- Create an empty dj_hetmech database
CREATE DATABASE dj_hetmech;
ben-heil commented 5 years ago

Whenever the database is repopulated (or at least the past two times), the read_only_user doesn't have any permissions for the database

In the current version of the database read_only_user works fine after manually adding permissions, but I think there should be a way to add permissions automatically when reloading the database