ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

Need Help with Arctos Error - FATAL: the database system is in recovery mode #4896

Closed Jegelewicz closed 2 years ago

Jegelewicz commented 2 years ago

Error Text

ERROR_ID | 0A2106D9-0319-4996-A896C5C4534C5DED
-- | --
ERROR_TYPE | SQL
ERROR_MESSAGE | FATAL: the database system is in recovery mode
ERROR_DETAIL |
ERROR_SQL |

Where it happened

https://arctos.database.museum/guid/NHSM:Geol:1710

Steps to get there

Edit identification, select save

Problem

Community response to describe the problem that caused the error

Solution

Community response with directions for how to correct the problem

Jegelewicz commented 2 years ago

Seems like this resolved itself - but was kinda scary?

dustymc commented 2 years ago

Neither TACC nor I can see anything scary going on. Glitch in the matrix....

dustymc commented 2 years ago

FYI: Chris was able to trace this back to conflicting locks (ultimately caused by https://github.com/ArctosDB/arctos/issues/4758).

I built a new function that ignores noninteger data, which should deal with that but does add a (relatively small) cost.
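
For anyone wondering what "ignores noninteger data" might look like in practice, here is a minimal Python sketch. It is not the Arctos function (that code is not shown in this thread); the name `to_int_or_none` and the sample values are made up for illustration. The idea is to hand back nothing for a value that is not a plain integer instead of raising an error partway through an update, at the price of one extra check per value.

```python
import re
from typing import Optional

# Hypothetical helper, not the actual Arctos function.
# Accepts an optional sign followed by ASCII digits only.
_INTEGER = re.compile(r"[+-]?[0-9]+")

def to_int_or_none(value: Optional[str]) -> Optional[int]:
    """Return the integer held in `value`, or None if it is not a plain integer."""
    if value is None:
        return None
    text = value.strip()
    if _INTEGER.fullmatch(text) is None:
        # Mistyped or noninteger data is skipped rather than raising an error.
        return None
    return int(text)

# Illustrative inputs only; none of these come from Arctos data.
values = ["1710", " 42 ", "12a", "", "3.5", None]
print([to_int_or_none(v) for v in values])
# prints [1710, 42, None, None, None, None]
```

The regular-expression check is the "(relatively small) cost": every value is inspected before conversion, but a bad value can no longer abort the surrounding work.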

I think this is essentially a 'that's why banks pay for Oracle' error.

Jegelewicz commented 2 years ago

It did happen to me again - it seems like it is when I have edited an identification and the stuff hasn't caught up in flat (the stuff in the catalog record header disagrees with the stuff in the identification). It eventually works itself out.

dustymc commented 2 years ago

Yep, Chris caught that one too: a segfault, I think caused by the same mess bouncing around. Looks like everything's recovering anyway.

There's an 'and tell flat to update when it gets around to it' component to updating identifications, and even tiny updates require locks. In this case flat and filtered_flat are fighting for a lock (because of the error caused by mistyped data), and we didn't pay for Oracle, so they just keep going at it until someone runs out of something and keels over....
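
The "update when it gets around to it" part is why the catalog record header can briefly disagree with the identification. A rough Python sketch of that pattern follows, under loud assumptions: the dictionaries, the queue, and the function names are hypothetical stand-ins, not Arctos tables or code, and the sketch leaves locking out entirely.

```python
from collections import deque

# Hypothetical stand-ins for the real tables; names and values are illustrative only.
identifications = {"NHSM:Geol:1710": "old identification"}
flat = {"NHSM:Geol:1710": "old identification"}
pending = deque()  # records whose flat row is waiting to be refreshed

def save_identification(guid: str, name: str) -> None:
    """Write the edit right away and queue the flat refresh for later."""
    identifications[guid] = name
    if guid not in pending:
        pending.append(guid)

def refresh_flat() -> None:
    """Background step: bring queued flat rows back in line with the source."""
    while pending:
        guid = pending.popleft()
        flat[guid] = identifications[guid]

save_identification("NHSM:Geol:1710", "new identification")
stale = flat["NHSM:Geol:1710"] != identifications["NHSM:Geol:1710"]
refresh_flat()
in_sync = flat["NHSM:Geol:1710"] == identifications["NHSM:Geol:1710"]
print(stale, in_sync)
# prints True True
```

Between the save and the refresh the two copies disagree, which matches the "hasn't caught up in flat" window described above; once the refresh runs they agree again.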

This particular set of circumstances no longer exists so this rare situation should be a bit more rare.