tursodatabase / libsql

libSQL is a fork of SQLite that is both Open Source, and Open Contributions.
https://turso.tech/libsql
MIT License
11.87k stars, 294 forks

ERROR libsql::replication: replicator sync | database disk image is malformed #1607

Open Aft1n opened 4 months ago

Aft1n commented 4 months ago

I have an app deployed in production that keeps a replica db in the app folder. Today I checked the logs and saw this error:

ERROR libsql::replication: replicator sync error: replication error: Injector error: SQLite error: database disk image is malformed

Everything worked perfectly fine for a couple of weeks and no changes were made to the db schema; this error simply started showing up in the logs.

Aft1n commented 4 months ago

Somehow my replica db got corrupted. After re-syncing the replica db locally and pushing it back to the server, the error went away. What could have caused the problem? Multiple user writes, or something else?

haaawk commented 4 months ago

Sometimes embedded replicas get corrupted when their db file is opened with a regular sqlite3 driver. Maybe you did something like that?

@LucioFranco Would you be able to help please?

Aft1n commented 4 months ago

Since this file was deployed in production, I didn't touch it. I did have some active traffic to the website before noticing this error, though it was mostly reads. At first I thought it was some kind of attack, like SQL injection (I'm not very proficient in those issues).

But could multiple simultaneous writes, plus a potential error, corrupt it?

haaawk commented 4 months ago

It is not possible to write directly to embedded replicas. Writes are forwarded to a primary in the cloud and then fetched back with sync, so multiple writers should be fine, I think.

Aft1n commented 4 months ago

Yeah, I know that. But if there was a race condition or some error during writes, maybe that could corrupt the replica dbs on sync. The interesting thing is that I have a full-stack app with its own replica, and a separate API that also has its own replica. Even though the activity was happening in the full-stack app, the DB inside my API application was corrupted as well.

So basically something went wrong in the Turso parent db, and the error was synced to both replicas. But everything worked totally fine in the Turso dashboard, and after deleting the replicas and syncing again it got fixed.

haaawk commented 4 months ago

@LucioFranco I think you should investigate that as part of your embedded replicas work

LucioFranco commented 4 months ago

@Aft1n could you share the version of libsql that you were using?

Aft1n commented 4 months ago

I was using the latest: 7.0

LucioFranco commented 3 months ago

I think your theory sounds correct. If you start to see this issue again, you can ping me on Discord and I can take a look at what is going on. This is slightly hard to debug since the malformed error is quite cryptic.

Aft1n commented 3 months ago

So this error happened again. Here's an excerpt of the error from the logs:

22:56:11 1|main | error: replication error: Injector error: SQLite error: database disk image is malformed
22:56:11 1|main | at new Database (/app/node_modules/libsql/index.js:75:17)
22:56:11 1|main | at _createClient (/app/node_modules/@libsql/client/lib-esm/sqlite3.js:39:16)
22:56:11 1|main | at /app/src/db/index.ts:7:28
22:56:11 1|main | Bun v1.1.9 (Linux x64)

22:54:47 1|main | 2024-08-02T22:54:47.369381Z ERROR libsql::replication: replicator sync error: replication error: Injector error: SQLite error: database disk image is malformed
22:54:47 0|main | 2024-08-02T22:54:47.376994Z ERROR libsql::replication: replicator sync error: replication error: Injector error: SQLite error: database disk image is malformed

And I have noticed that my Turso sync usage skyrocketed, possibly because sync kept hitting the error and couldn't go through. It's currently at 5gb/2gb.
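
If the usage spike really does come from sync attempts that keep failing, one way to limit it might be to back off between attempts instead of retrying at a fixed rate. This is only a sketch, assuming the app triggers sync itself on a timer. `SyncBackoff` and `syncLoop` are hypothetical names, not part of @libsql/client, the intervals are made-up defaults, and `sync` stands in for whatever call the app uses to start replication.

```typescript
// Hypothetical helper, not part of @libsql/client: tracks consecutive sync
// failures and tells the caller how long to wait before the next attempt.
class SyncBackoff {
  private failures = 0;
  private readonly baseMs: number;
  private readonly maxMs: number;

  // Defaults are assumptions: sync every minute, never wait more than an hour.
  constructor(baseMs: number = 60 * 1000, maxMs: number = 60 * 60 * 1000) {
    this.baseMs = baseMs;
    this.maxMs = maxMs;
  }

  recordSuccess(): void {
    this.failures = 0;
  }

  recordFailure(): void {
    this.failures += 1;
  }

  // The delay doubles with each consecutive failure, capped at maxMs.
  nextDelayMs(): number {
    return Math.min(this.baseMs * 2 ** this.failures, this.maxMs);
  }
}

// Usage sketch: `sync` is a placeholder for the app's own sync call.
async function syncLoop(
  sync: () => Promise<unknown>,
  backoff: SyncBackoff = new SyncBackoff(),
): Promise<never> {
  for (;;) {
    try {
      await sync();
      backoff.recordSuccess();
    } catch (err) {
      backoff.recordFailure();
      console.error("sync failed, backing off:", err);
    }
    const delay = backoff.nextDelayMs();
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
}
```

This would not fix the corruption itself. It would only slow down how quickly a broken replica eats into the sync quota.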

Aft1n commented 3 months ago

I also found this one in my full-stack app, which is basically the source of all changes sent to Turso. It's a cron task that failed, I assume, or something happened that closed the connection. Maybe this could affect it?

error logs

vanillacode314 commented 2 months ago

Having the same issue with a deployment on Railway.

LucioFranco commented 1 month ago

@Aft1n @vanillacode314 are either of you still experiencing this issue? If so, could you give me a small example of how you're able to trigger it? Unfortunately, the error message alone is not very helpful: it doesn't tell us why the database was malformed, and I am unable to reproduce this.

Aft1n commented 3 weeks ago

In my case it's pretty random. I have 2 instances of my app deployed on different servers, and as you can see from the screenshot, one works perfectly fine while the other throws this error. Sometimes both produce this error. I need to delete the Turso replica files in my project and redeploy my app to make it go away.

I have updated to the latest version of libsql ("@libsql/client": "^0.14.0"), but the issue still persists. And if I'm not paying attention, my Turso embedded syncs skyrocket; they are currently at 10gb/3gb.

Screenshot 2024-10-30 at 10 57 28
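
Until the root cause is found, the manual workaround described in this thread (delete the local replica files, then let the app re-sync) could probably be automated at startup. The following is a rough sketch under stated assumptions, not an official fix. `resetReplica`, `isMalformedError`, and `openWithRecovery` are hypothetical names, `openClient` is a placeholder for however the app creates its client, and the rule of deleting every file named `<replica>` or `<replica>-<suffix>` is an assumption about how the sidecar files are named, so check what actually sits next to your replica before using anything like this.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: removes the local replica file and any sidecar files
// named "<replica>-<suffix>" in the same directory (assumed naming scheme),
// so the next start re-syncs from the primary. Returns the deleted file names.
function resetReplica(replicaPath: string): string[] {
  const dir = path.dirname(replicaPath);
  const base = path.basename(replicaPath);
  const removed: string[] = [];
  for (const name of fs.readdirSync(dir)) {
    // Match the replica itself and its "-suffix" sidecars, nothing else.
    if (name === base || name.startsWith(base + "-")) {
      fs.rmSync(path.join(dir, name), { force: true });
      removed.push(name);
    }
  }
  return removed;
}

// True when an error looks like the one reported in this thread.
function isMalformedError(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return message.includes("database disk image is malformed");
}

// Tries to open the client once; on a "malformed" error, wipes the local
// replica files and tries exactly one more time. Any other error, or a
// second failure, propagates to the caller.
function openWithRecovery<T>(replicaPath: string, openClient: () => T): T {
  try {
    return openClient();
  } catch (err) {
    if (!isMalformedError(err)) throw err;
    resetReplica(replicaPath);
    return openClient();
  }
}
```

This only covers the case where the error is thrown while the client is being created, as in the stack trace posted earlier. Errors that the replicator merely logs in the background would not be caught this way.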