sqlitebrowser / dbhub.io

A "Cloud" for SQLite databases. Collaborative development for your data. 😊
https://dbhub.io
GNU Affero General Public License v3.0

Weird disk I/O error showing on webUI console #24

Closed justinclift closed 7 years ago

justinclift commented 7 years ago

A strange disk I/O error has appeared on the dev1 webUI console a few times over the last several weeks:

2017/04/11 08:27:20 Error retrieving table names: disk I/O error (SELECT name FROM sqlite_master WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY 1) (disk I/O error)
2017/04/11 08:27:31 Error retrieving table names: disk I/O error (SELECT name FROM sqlite_master WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY 1) (disk I/O error)
2017/04/11 08:27:48 Error retrieving table names when sanity checking upload: file is encrypted or is not a database (SELECT name FROM sqlite_master WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY 1) (file is encrypted or is not a database)

Relevant data points:

So, I'm thinking it's an error in our code. First thought is perhaps a disk flushing problem. We pull the target database out of Minio (where it's stored), write it to a temp file, then read from that using a Go SQLite library.

If the temp file (somehow) isn't fully written by the time we attempt to read from it, I can imagine hitting this kind of error.

As a first attempted workaround, we can try closing the temp file (which should flush it). Then re-open it and pass the re-opened handle to the SQLite library.
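A minimal sketch of that workaround, using only the Go standard library. The function names and the temp file pattern are hypothetical, not taken from the DBHub.io code. One caveat: in Go, `Close` on an `*os.File` releases the handle but does not by itself force the data to disk, so the sketch also calls `Sync` before closing.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// writeTempAndReopen copies src into a new temp file, syncs and closes it,
// then re-opens it read-only so the reader only ever sees the full contents.
// The file name pattern is illustrative only.
func writeTempAndReopen(src io.Reader) (*os.File, error) {
	tmp, err := os.CreateTemp("", "dbhub-*.sqlite")
	if err != nil {
		return nil, err
	}
	if _, err := io.Copy(tmp, src); err != nil {
		tmp.Close()
		os.Remove(tmp.Name())
		return nil, err
	}
	// Sync asks the OS to commit the written data to stable storage.
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		os.Remove(tmp.Name())
		return nil, err
	}
	if err := tmp.Close(); err != nil {
		os.Remove(tmp.Name())
		return nil, err
	}
	// Fresh handle: nothing from the write side is still pending.
	return os.Open(tmp.Name())
}

// roundTrip writes data via writeTempAndReopen, reads it back, and removes
// the temp file. It exists only to demonstrate the helper.
func roundTrip(data string) (string, error) {
	f, err := writeTempAndReopen(strings.NewReader(data))
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	defer f.Close()
	b, err := io.ReadAll(f)
	return string(b), err
}

func main() {
	got, err := roundTrip("pretend this is a SQLite database")
	fmt.Println(got, err)
}
```

In real use the path from `f.Name()` would be what gets handed to the SQLite library, since SQLite normally opens databases by file name.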

If that doesn't work... we'll probably need to add some kind of verbose debugging/logging code. Hopefully this isn't due to a bug in the SQLite library we're using. :wink:

chrisjlocke commented 7 years ago

Where is the database you're opening ... local (as far as SQLite is concerned) or on a network?

Thanks, Chris

justinclift commented 7 years ago

Local. Good thought, though. It's not an I/O issue on a network share (or similar). :smile:

chrisjlocke commented 7 years ago

.log stderr?

I'm mumbling words I don't understand, but picked up from here: ;) http://dba.stackexchange.com/questions/44156/disk-i-o-error-in-sqlite

Wondering if (like in that example) there was more to the error.... some more codes....

justinclift commented 7 years ago

Yeah, stdout and stderr are what's being logged. That's where the errors are showing up.

Nothing is showing up at an OS level though, so it doesn't seem like an OS or hardware level problem.

I'll add the disk flushing code in a bit, and we can just let that run and see if the problem still keeps showing up or not.

MKleusberg commented 7 years ago

The disk I/O error is an error message by SQLite which can be caused by all sorts of problems. So it's really hard to guess what's happening in this case.

@justinclift Here's an idea for how we could narrow down the problem. Can you search for the 'Error retrieving table names [...]' error messages in our code and add a call to the int sqlite3_extended_errcode(sqlite3 *db); function before each of these log.Printf calls? The return value of that function should be printed as well, so we get a more specific error code (see https://sqlite.org/rescode.html#extrc).

justinclift commented 7 years ago

Interesting. Yep, I'll see if that can be done. It's even possible the SQLite library we use has that functionality already in use somewhere and I just need to learn about it. :smile:

MKleusberg commented 7 years ago

Looks like there is indeed a method for this in the library we use. See here. From what I understand, it might be as easy as calling err.ExtendedCode() on the err objects and printing the result, but I might be mistaken :wink:
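A rough sketch of how that logging could look. The library's real error type isn't shown in this thread, so the code below assumes only that the error exposes an `ExtendedCode() int` method. The `extendedCoder` interface, the `fakeSQLiteError` stand-in, and `describe` are all hypothetical names made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// extendedCoder is a hypothetical interface: anything exposing the SQLite
// extended result code. The real library's type may differ.
type extendedCoder interface {
	ExtendedCode() int
}

// fakeSQLiteError stands in for the library's error type so the sketch
// runs without a SQLite dependency.
type fakeSQLiteError struct {
	msg  string
	code int
}

func (e *fakeSQLiteError) Error() string     { return e.msg }
func (e *fakeSQLiteError) ExtendedCode() int { return e.code }

// describe appends the extended result code to the message when the error
// (or one it wraps) provides one; otherwise it returns the plain message.
func describe(err error) string {
	var ec extendedCoder
	if errors.As(err, &ec) {
		return fmt.Sprintf("%v (extended code %d)", err, ec.ExtendedCode())
	}
	return err.Error()
}

func main() {
	// 522 is SQLITE_IOERR_SHORT_READ (SQLITE_IOERR | 2<<8), used here
	// purely as an example value.
	err := &fakeSQLiteError{msg: "disk I/O error", code: 522}
	fmt.Println("Error retrieving table names:", describe(err))
}
```

The numeric code can then be looked up in the extended result code list linked above.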

justinclift commented 7 years ago

Thanks, yep that looks like the right kind of thing.

I'll get to this after I've finished getting the initial "Settings" page working. That's turned out to be a bit more complicated than originally expected... though I kind of suspected it would. :wink:

justinclift commented 7 years ago

Pretty sure I've found the cause of this:

    https://github.com/sqlitebrowser/dbhub.io/blob/2f060bcdf95e2d75f11fd013eb4b4da16715e1ec/common/minio.go#L120

It mostly looks like the cleanup code is defined in the wrong place, so it runs prematurely. I've fixed it in the temp branch I'm working on, so once that's in a good state + squashed + merged we can keep an eye on dev1 and see if it shows up again. It shouldn't. In theory. :smile:
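For anyone reading later, here is a guess at the general class of bug, not the actual code at the linked line. If a helper defers removal of its temp file, the cleanup runs as soon as the helper returns, before the caller has opened the file. All names below are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// fetchBuggy writes data to a temp file but defers the removal inside the
// helper, so the file is already gone when the caller receives the path.
func fetchBuggy(data []byte) (string, error) {
	tmp, err := os.CreateTemp("", "dbhub-*.sqlite")
	if err != nil {
		return "", err
	}
	defer os.Remove(tmp.Name()) // runs when fetchBuggy returns: too early
	defer tmp.Close()
	if _, err := tmp.Write(data); err != nil {
		return "", err
	}
	return tmp.Name(), nil
}

// fetchFixed hands the cleanup function to the caller, who decides when
// the temp file is no longer needed.
func fetchFixed(data []byte) (string, func(), error) {
	tmp, err := os.CreateTemp("", "dbhub-*.sqlite")
	if err != nil {
		return "", nil, err
	}
	cleanup := func() { os.Remove(tmp.Name()) }
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		cleanup()
		return "", nil, err
	}
	if err := tmp.Close(); err != nil {
		cleanup()
		return "", nil, err
	}
	return tmp.Name(), cleanup, nil
}

// exists reports whether a file is present at path.
func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	p, err := fetchBuggy([]byte("data"))
	fmt.Println("buggy: file exists after return:", exists(p), err)

	q, cleanup, err := fetchFixed([]byte("data"))
	if err != nil {
		fmt.Println("fetchFixed failed:", err)
		return
	}
	defer cleanup() // the caller removes the file once it is done with it
	fmt.Println("fixed: file exists after return:", exists(q))
}
```

Returning the cleanup function is only one way to structure this. Moving the `defer` up into the calling function would work just as well.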

justinclift commented 7 years ago

Haven't seen this happen since the above fix, so closing this now. :smile: