vlcn-io / js

Components to build apps in JavaScript atop cr-sqlite
MIT License
54 stars 6 forks source link

Malformed DB using fts5 #31

Closed theboolean closed 10 months ago

theboolean commented 12 months ago

Hi! I think I found a blocking bug for @vlcn.io/crsqlite-wasm when using fts5.

Basically after some INSERT I receive this error and the db becomes unusable:

Error: database disk image is malformed
    at check (sqlite-api.js:866:1)
    at Object.step (sqlite-api.js:699:1)
    at async TX.statements (TX.ts:152:1)

I saw this can be mitigated by wrapping the INSERT into a transaction, but still if you perform lot of transactions this happens again.

Minimal code to reproduce:

    const sqlite = await initWasm()
    const db = await sqlite.open('sqlite_test.db')
    await db.exec('CREATE VIRTUAL TABLE IF NOT EXISTS test_fts USING fts5(fieldA, fieldB, fieldC, fieldD, fieldE, id UNINDEXED, prefix=2, detail=none);')

    for (let i = 0; i < 1000; i++) {
      await db.exec(`INSERT INTO test_fts (fieldA, fieldB, fieldC, fieldD, fieldE, id) VALUES ('test', 'test', 'test', 'test', 'test', '${i}');`)
        .catch(err => {
          console.error(err, i)
        })
    }

to me it always fail during iteration 877.

I also created a basic create-react-app project that demonstrates this, code lives here.

theboolean commented 12 months ago

Forgot to say: tested in Chrome and Firefox on Mac OS

tantaman commented 12 months ago

Thanks for the report. It may be a problematic interaction with the indexeddb vfs. I should release a variant of cr-sqlite that lets the user choose their virtual filesystem so we can track these things down.

theboolean commented 12 months ago

FYI I tried to set the DB in memory and the error didn't occur, so you're right: it must be something related to indexeddb. Did you manage to reproduce the bug on your side?

AlexErrant commented 12 months ago

I found a blocking bug

In the interest of getting you unblocked, have you tried inserting multiple values at a time, e.g.

INSERT INTO test_fts (
  fieldA, fieldB, fieldC, fieldD, fieldE, 
  id
) 
VALUES 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  ), 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  ), 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  ), 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  ), 
  (
    'test', 'test', 'test', 'test', 'test', 
    '${i}'
  );

I do this in my own project here in batches of 1000 and was able to add ~35k rows without database disk image is malformed (I really should use a transaction though). Schema here - I only have 3 FTS columns though compared to your example's 5. If you want to test my code I can upload those rows somewhere.

theboolean commented 11 months ago

Unfortunately, this is not a workaround we can implement with our use case. We perform lot of updates after the first db population, and eventually it will corrupt

tantaman commented 11 months ago

@theboolean - would switching to OPFS be acceptable for your use case? I think the OPFS VFS should handle this fine since it is less temperamental than indexeddb. I'll need to expose OPFS for you -- it currently isn't exposed by the cr-sqlite wrappers over wa-sqlite.

@rhashimoto - just an fyi. I assume this is in wa-sqlite itself but haven't stripped out cr-sqlite layers to confirm.

rhashimoto commented 11 months ago

Will take a look, but I've never used FTS (it's not in the wa-sqlite default build) and I'm deep into something else right now.

rhashimoto commented 11 months ago

@theboolean If you run your test with Dev Tools open is there anything interesting printed to the console when it fails?

theboolean commented 11 months ago

@tantaman I would like to use indexeddb because of wider browser support, but I can give OPFS a try to compare performances.

@rhashimoto the only error I receive is the one I pasted in the OP, but I added the repo that can reproduce the issue to codesandbox, so you can try by yourself.

Thank you guys for the interest in this issue 🙏🏻

theboolean commented 11 months ago

@tantaman is exposing OPFS a simple change to do?

tantaman commented 11 months ago

@theboolean - should be pretty straightforward. Would be a matter of exposing an option for the user to supply a VFS class to initWasm

See these lines:

https://github.com/vlcn-io/js/blob/705f6d2409d84c58c7953276e6b9102c2c08501b/packages/crsqlite-wasm/src/index.ts#L55-L57

https://github.com/vlcn-io/js/blob/705f6d2409d84c58c7953276e6b9102c2c08501b/packages/crsqlite-wasm/src/index.ts#L39-L57

rhashimoto commented 11 months ago

Reproduced in wa-sqlite. It fails with SQLITE_CORRUPT on iteration 877, exactly as reported.

rhashimoto commented 11 months ago

It's a bug in IDBBatchAtomicVFS. It also reveals a likely bug in SQLite, without which it probably wouldn't have shown up here. It's still a critical bug in IDBBatchAtomicVFS, though, as I think it can be exposed in other ways. I hope to merge the fix tomorrow-ish.

tantaman commented 11 months ago

Appreciate it.

@theboolean - I'll roll a new release to update to https://github.com/rhashimoto/wa-sqlite/releases/tag/v0.9.8 which will resolve this issue.

rhashimoto commented 11 months ago

@theboolean What a great bug report, including a compact reproduction! This exposed not one but two quite subtle bugs, one in SQLite which is possibly the best tested library on the planet (and one in wa-sqlite which is...not). The bug was fixed in SQLite only hours after submission, and was considered important enough to bypass the code freeze for their imminent release.

tantaman commented 11 months ago

@theboolean, @iansinnott - do you guys need a fix immediately or can you wait till the v0.16 release?

theboolean commented 11 months ago

@tantaman I think it makes sense to merge wa-sqlite v0.9.8 as a patch and then releasing v0.16. What do you think?

iansinnott commented 11 months ago

@theboolean, @iansinnott - do you guys need a fix immediately or can you wait till the v0.16 release?

@tantaman no rush, thanks for looking into this

tantaman commented 11 months ago

This'll fix it -- https://github.com/vlcn-io/js/pull/36

but I'll need to publish a new release for everyone.

theboolean commented 10 months ago

Hi @tantaman! Touching base just to check if there are any updates on this :)

tantaman commented 10 months ago

@theboolean - I'll be publishing a new release today. Luckily it'll be compatible with the current release as we decided to revert some breaking changes.

tantaman commented 10 months ago

re-opening until the release is actually published.

tantaman commented 10 months ago

Almost there. Need to get this pulled in -- https://github.com/vlcn-io/js/pull/38

tantaman commented 10 months ago

New release published. You should be all set. Note that I've published these under the next tag until I've had a chance to roll them into the various examples and demo apps to further verify the builds.

theboolean commented 10 months ago

Thank you @tantaman, @rhashimoto, and all the people involved in fixing this 🙏🏻

tantaman commented 10 months ago

v0.16.0-next.2 is pretty solid at this point. There were a few bugs in v0.16.0-next.0 that are now fixed.

Will be making next.2 the stable release soon.