transientskp / tkp

A transients-discovery pipeline for astronomical image-based surveys
http://docs.transientskp.org/
BSD 2-Clause "Simplified" License
19 stars 15 forks source link

Database lockout #543

Closed mkuiack closed 7 years ago

mkuiack commented 7 years ago

While processing a batch of images TraP failed when there was an attempt to delete inactive entries from the runningcatalog table. One of the entires was still referenced in the varmetric table. This causes an issue with "foreign key constraints" and I don't know enough (any) SQL to figure out what's going on. The database is now completely locked up so that I cannot just rerun the batch of images with a new dataset.

from the trap.log:

2017-01-07 04:46:21 INFO tkp.main: processing 1 images in timestep 2016-09-05 18:17:15.310000 (5343/10604)
2017-01-07 04:46:23 INFO tkp.main: found 33 blind sources in 1 images
2017-01-07 04:46:24 INFO tkp.main: performed 22 forced fits in 1 images
2017-01-07 04:46:24 INFO tkp.main: calculating variability metrics
2017-01-07 04:46:37 INFO tkp.main: processing 1 images in timestep 2016-09-05 18:17:16.317000 (5344/10604)
2017-01-07 04:46:39 INFO tkp.main: found 33 blind sources in 1 images
2017-01-07 04:46:40 INFO tkp.main: performed 23 forced fits in 1 images
2017-01-07 04:46:40 INFO tkp.main: calculating variability metrics
2017-01-07 04:46:53 INFO tkp.main: processing 1 images in timestep 2016-09-05 18:17:17.324000 (5345/10604)
2017-01-07 04:46:55 INFO tkp.main: found 23 blind sources in 1 images
2017-01-07 04:46:59 ERROR tkp.db.database: Query failed: (psycopg2.IntegrityError) update or delete on table "runningcatalog" violates foreign key constraint "varmetric_runcat_fkey" on table "varmetric"
DETAIL:  Key (id)=(2205) is still referenced from table "varmetric".
 [SQL: 'DELETE\n  FROM runningcatalog\n WHERE inactive = TRUE\n']. Query: DELETE
  FROM runningcatalog
 WHERE inactive = TRUE
.
2017-01-07 04:46:59 ERROR tkp.main: timestep raised <class 'sqlalchemy.exc.IntegrityError'> exception: (psycopg2.IntegrityError) update or delete on table "runningcatalog" violates foreign key constraint "varmetric_runcat_fkey" on table "varmetric"
DETAIL:  Key (id)=(2205) is still referenced from table "varmetric".
 [SQL: 'DELETE\n  FROM runningcatalog\n WHERE inactive = TRUE\n']
2017-01-07 04:46:59 INFO tkp.main: processing 1 images in timestep 2016-09-05 18:17:18.330000 (5346/10604)
2017-01-07 04:46:59 ERROR tkp.db.database: Query failed: (psycopg2.InternalError) current transaction is aborted, commands ignored until end of transaction block
 [SQL: 'SELECT * FROM dataset WHERE id=%s'] [parameters: (2,)]. Query: SELECT * FROM dataset WHERE id=2.
2017-01-07 04:46:59 ERROR tkp.main: timestep raised <class 'sqlalchemy.exc.InternalError'> exception: (psycopg2.InternalError) current transaction is aborted, commands ignored until end of transaction block
 [SQL: 'SELECT * FROM dataset WHERE id=%s'] [parameters: (2,)]
2017-01-07 04:46:59 INFO tkp.main: processing 1 images in timestep 2016-09-05 18:17:19.337000 (5347/10604)
2017-01-07 04:46:59 ERROR tkp.db.database: Query failed: (psycopg2.InternalError) current transaction is aborted, commands ignored until end of transaction block
 [SQL: 'SELECT * FROM dataset WHERE id=%s'] [parameters: (2,)]. Query: SELECT * FROM dataset WHERE id=2.
2017-01-07 04:46:59 ERROR tkp.main: timestep raised <class 'sqlalchemy.exc.InternalError'> exception: (psycopg2.InternalError) current transaction is aborted, commands ignored until end of transaction block
 [SQL: 'SELECT * FROM dataset WHERE id=%s'] [parameters: (2,)]
mkuiack commented 7 years ago

I've tried manually

(trapvenv)mkuiack@struis:/scratch/mkuiack
$ psql -h vlo.science.uva.nl -U mkuiack -c "DELETE FROM varmetric WHERE id=2205;" nobeam_16SB
Password for user mkuiack: 
DELETE 0

then

(trapvenv)mkuiack@struis:/scratch/mkuiack
$ psql -h vlo.science.uva.nl -U mkuiack -c "DELETE FROM runningcatalog WHERE inactive= TRUE;" nobeam_16SB
Password for user mkuiack: 
ERROR:  update or delete on table "runningcatalog" violates foreign key constraint "varmetric_runcat_fkey" on table "varmetric"
DETAIL:  Key (id)=(2205) is still referenced from table "varmetric".
gijzelaerr commented 7 years ago

you cant delete runcat entries that are referenced by other objects in your database. This is a feature, you don't want objects with dangling references. Depending on what you want to do, you can manually remove the entries in the varmetric tables that are referencing to the runcat entries you want to do delete.

or, if you are lazy, you can do a cascaded delete, google SQL delete casacade. I'm not sure how to do that also. But be careful if you do such a thing not carefully you may end up with an empty database.

mkuiack commented 7 years ago

Not sure what caused this bug but the solution was:

$ psql -h vlo.science.uva.nl -U mkuiack -c "DELETE FROM varmetric WHERE runcat = 2205;" nobeam_16SB
Password for user mkuiack: 
DELETE 1
(trapvenv)mkuiack@struis:/scratch/mkuiack
$ psql -h vlo.science.uva.nl -U mkuiack -c "DELETE FROM runningcatalog WHERE inactive= TRUE;" nobeam_16SB
Password for user mkuiack: 
DELETE 1

Trap is now running thought the images again.

gijzelaerr commented 7 years ago

this iissue has been resolved right @mkuiack ? if so, please close this issue.