WiseLibs / better-sqlite3

The fastest and simplest library for SQLite3 in Node.js.
MIT License
5.47k stars 395 forks source link

database malformed issue #877

Closed cisaacson closed 2 years ago

cisaacson commented 2 years ago

We are running a SQLite database on SQLite version 3.38.5, using better-sqlite3 version ^7.4.0. We are getting this error from SQLite: database disk image is malformed. This happens when we try and perform a query.

When we check the database through the sqlite3 command line utility, the integrity check shows the database file is fine, and we can perform reads and writes to the database.

Since we know the database integrity is OK, any idea why we would see this error?

cisaacson commented 2 years ago

Looks like the 7.4.0 version was not working correctly with this version of SQLite. We upgraded to 7.6.2 and everything worked fine. Closing this issue.

Great product by the way!

cisaacson commented 2 years ago

We are still seeing this issue, even after upgrading. I think it is related to another node module that is writing to the SQLite db. better-sqlite is mostly reading from the db.

Any idea what could cause this? We definitely are getting SQLITE_CORRUPT from better-sqlite, yet the database checks out fine with the PRAGMA integrity_check;.

Not sure if it is related, but we maintain a single persistent better-sqlite connection for the application. The other component that is writing uses a connection pool and connections can close after a few minutes idle time.

Any ideas appreciated.

Prinzhorn commented 2 years ago

We are running a SQLite database on SQLite version 3.38.5

I don't think that's how it works. Each process runs a specific SQLite version. The database file itself does not have a "SQLite version". With better-sqlite3 7.6.2 you are now running SQLite 3.39.1.

I think it is related to another node module that is writing to the SQLite db.

What version does it use? You will get SQLITE_CORRUPT when you try reading/writing a database file that uses features that your SQLite version does not have. E.g. if you have a STRICT table and try using that table with SQLite <=3.36.0 it will choke on the DDL and SQLITE_CORRUPT.

The other component that is writing uses a connection pool and connections can close after a few minutes idle time.

The concept of connection pools does not make much sense with SQLite. There is no networking. If anything your pooling logic and the overhead makes everything slower, not faster.

cisaacson commented 2 years ago

@Prinzhorn thanks for the quick response.

What version does it use?

The other library uses SQLite version 3.39.0, that is a native Rust module we have that is the only writer to the SQLite db. We do not get any errors with that functionality, only with better-sqlite which is reading.

You will get SQLITE_CORRUPT when you try reading/writing a database file that uses features that your SQLite version does not have.

I doubt we are using a feature that better-sqlite does not support, so what could cause this on a read? And how do we find out what is causing the error? It is happening on a query, nothing out of the ordinary, and it only happens after our application has been running for some time (not at the start). The same query runs fine from the sqlite3 command line utility, whether 3.38 or 3.39.

The concept of connection pools does not make much sense with SQLite.

I agree that connection pooling doesn't make much sense, we will remove that, but I don't think it has anything to do with the error based on your feedback. However, the connection pool is closing connections after some time, which removes the WAL file from the database (correct behavior). Could that be throwing off better-sqlite? Or a threading/contention issue?

cisaacson commented 2 years ago

We have confirmed that the error only shows up after the Rust connection pool times out and the WAL file is no longer there. It works fine before then. While we are removing the connection pool, it would be good to know how to track this down. We also confirmed that a simple query is what is generating the error.

Prinzhorn commented 2 years ago

I doubt that this is related to better-sqlite3. The error comes from SQLite itself, which better-sqlite3 does not touch. You are probably doing something funny that SQLite does not like.

which removes the WAL file from the database (correct behavior). and the WAL file is no longer there.

That doesn't sound correct (to me at least). Are you saying your better-sqlite3 connection is not using WAL but the Rust connections are? I don't know what the SQLite docs say about mixing different journal_mode or if you're simply running into a SQLite bug when a journal_mode=WAL connection closes while a journal_mode=DELETE connection is active.

cisaacson commented 2 years ago

@Prinzhorn The documentation says that when using WAL mode and all connections are closed, the WAL file will be cleaned up. The connection pool is closing all of its connections after 10 minutes, and we seen the WAL file go away at that point.

For the better-sqlite connection we never close it (so SQLite should think there is still a connection open). Do we need to connect it using an option for journal_mode=WAL? We don't do that anywhere else, it is set in the database itself with an initialization batch.

Is there any possibility that the better-sqlite connection is closing somehow? We do that explicitly, but only when our application shuts down.

Prinzhorn commented 2 years ago

The documentation says that when using WAL mode and all connections are closed, the WAL file will be cleaned up.

But there are still connections open from better-sqlite3?

it is set in the database itself with an initialization batch.

What does that mean? The database doesn't do anything on its own.

Do we need to connect it using an option for journal_mode=WAL?

A PRAGMA. Give it a try, I assume that this is the culprit. better-sqlite3 is not connecting using WAL and SQLite will say the db is corrupt when you close all the WAL connections (which potentially is a SQLite bug, idk)

cisaacson commented 2 years ago

A PRAGMA. Give it a try, I assume that this is the culprit. better-sqlite3 is not connecting using WAL and SQLite will say the db is corrupt when you close all the WAL connections (which potentially is a SQLite bug, idk)

That makes the most sense, we will try and let you know.

cisaacson commented 2 years ago

@Prinzhorn The connection pool closing the connections was indeed the issue. We disabled that and now the WAL file stays in place. It is odd that SQLite is not recognizing the better-sqlite connection (which should preserve the WAL file), so you are right, it may be an issue on their side.

We did try the PRAGMA too for journal_mode, that made it so we did not see that error again, but queries just failed to work (no error message we could see).

This is fine now with the connection pool timeouts removed. We are also going to take your advice and remove the pool, that is unnecessary and things will be simpler without it.

Thanks again for the help, we can close this now.