duckdb / duckdb-wasm

WebAssembly version of DuckDB
https://shell.duckdb.org
MIT License

computed checksum does not match stored checksum #1690

Open markhalonen opened 2 months ago

What happens?

When loading an existing DuckDB file in Node.js, I get:

Error: IO Error: Corrupt database file: computed checksum 2810343048822349045 does not match stored checksum 11515148827668975023 in block
    at rt.runQuery (/REDACTED/node_modules/.pnpm/@duckdb+duckdb-wasm@1.28.1-dev173.0/node_modules/@duckdb/duckdb-wasm/dist/duckdb-node-eh.worker.cjs:80:80188)

The issue seems to occur when the DB is written: the resulting file can't be loaded with the DuckDB CLI binary either.
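
To tell this corruption error apart from other failures in the `catch` block of the scripts below, the two checksums can be pulled out of the message text. This is a minimal sketch: `parseChecksumError` is a hypothetical helper, not part of duckdb-wasm, and it assumes the message keeps the wording quoted above.

```javascript
// Hypothetical helper (not a duckdb-wasm API): extracts the checksums
// from an error message worded like the one quoted above.
function parseChecksumError(message) {
  const pattern = /computed checksum (\d+) does not match stored checksum (\d+)/
  const match = pattern.exec(String(message))
  if (!match) return null // some other error
  // Both values exceed Number.MAX_SAFE_INTEGER, so keep them as BigInt
  // instead of parsing them into lossy floating-point numbers.
  return { computed: BigInt(match[1]), stored: BigInt(match[2]) }
}
```

In the repro scripts this could be called as `parseChecksumError(e.message)` inside `catch (e)`; a `null` result means the failure was something other than a checksum mismatch.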

To Reproduce

I'm running duckdb-wasm from within Node.js because I couldn't get duckdb-node to build for an Alpine Docker container. My goal is to update an existing DuckDB file.

Similar goal to https://github.com/duckdb/duckdb-wasm/issues/1119, but hitting this error.

Create db1:

create_db1.js:

const duckdb = require('@duckdb/duckdb-wasm')
const path = require('path')
const Worker = require('web-worker')
const DUCKDB_DIST = path.dirname(require.resolve('@duckdb/duckdb-wasm'))
const fs = require('fs')

;(async () => {
  try {
    const DUCKDB_CONFIG = await duckdb.selectBundle({
      mvp: {
        mainModule: path.resolve(DUCKDB_DIST, './duckdb-mvp.wasm'),
        mainWorker: path.resolve(DUCKDB_DIST, './duckdb-node-mvp.worker.cjs'),
      },
      eh: {
        mainModule: path.resolve(DUCKDB_DIST, './duckdb-eh.wasm'),
        mainWorker: path.resolve(DUCKDB_DIST, './duckdb-node-eh.worker.cjs'),
      },
    })

    const logger = new duckdb.ConsoleLogger()
    const worker = new Worker(DUCKDB_CONFIG.mainWorker)
    const db = new duckdb.AsyncDuckDB(logger, worker)
    await db.instantiate(DUCKDB_CONFIG.mainModule, DUCKDB_CONFIG.pthreadWorker)

    // open() returns a Promise; await it so failures surface in the catch block
    await db.open({
      path: 'db1.duckdb',
      accessMode: duckdb.DuckDBAccessMode.READ_WRITE,
    })

    const conn = await db.connect()

    await conn.query(`CREATE TABLE t1(c1 integer PRIMARY KEY, asd text, r float);`)

    await conn.query(
      `insert into t1(c1, asd, r) SELECT c1, 'asdasdasdasd' as asd, random() as r FROM generate_series(1, 1000000) t(c1)`
    )

    await conn.close()
    await db.dropFiles()
    await db.terminate()
    await worker.terminate()
  } catch (e) {
    console.error(e)
  }
})()

node create_db1.js

Then try to load db1 into db2 and make some changes:

db1_to_db2.js:

const duckdb = require('@duckdb/duckdb-wasm')
const path = require('path')
const Worker = require('web-worker')
const DUCKDB_DIST = path.dirname(require.resolve('@duckdb/duckdb-wasm'))
const fs = require('fs')

;(async () => {
  try {
    const DUCKDB_CONFIG = await duckdb.selectBundle({
      mvp: {
        mainModule: path.resolve(DUCKDB_DIST, './duckdb-mvp.wasm'),
        mainWorker: path.resolve(DUCKDB_DIST, './duckdb-node-mvp.worker.cjs'),
      },
      eh: {
        mainModule: path.resolve(DUCKDB_DIST, './duckdb-eh.wasm'),
        mainWorker: path.resolve(DUCKDB_DIST, './duckdb-node-eh.worker.cjs'),
      },
    })

    const logger = new duckdb.ConsoleLogger()
    const worker = new Worker(DUCKDB_CONFIG.mainWorker)
    const db = new duckdb.AsyncDuckDB(logger, worker)
    await db.instantiate(DUCKDB_CONFIG.mainModule, DUCKDB_CONFIG.pthreadWorker)
    await db.registerFileURL('db1.duckdb', `db1.duckdb`, duckdb.DuckDBDataProtocol.NODE_FS, true)

    // open() returns a Promise; await it so failures surface in the catch block
    await db.open({
      path: 'db2.duckdb',
      accessMode: duckdb.DuckDBAccessMode.READ_WRITE,
    })

    const conn = await db.connect()
    const res = await conn.query(`attach 'db1.duckdb' (READ_WRITE)`)

    await conn.query(`copy from database db1 to db2`)

    await conn.query(`CREATE TABLE t4 as SELECT c1, 'asdasdasdasd' as asd FROM generate_series(1, 1000000) t(c1)`)

    await conn.close()
    await db.dropFiles()
    await db.terminate()
    await worker.terminate()
  } catch (e) {
    console.error(e)
  }
})()

node db1_to_db2.js

IMPORTANT:

It works if I remove the PRIMARY KEY constraint from t1.
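
To switch between the failing and working variants without editing the script by hand, the CREATE TABLE statement from create_db1.js can be parameterized. This is only a sketch: `createT1Sql` is a hypothetical helper introduced for illustration, and the column definitions are copied from the script above.

```javascript
// Hypothetical helper for bisecting on the constraint; not part of duckdb-wasm.
function createT1Sql(withPrimaryKey) {
  // Only the c1 column definition differs between the two variants.
  const c1 = withPrimaryKey ? 'c1 integer PRIMARY KEY' : 'c1 integer'
  return `CREATE TABLE t1(${c1}, asd text, r float);`
}
```

In create_db1.js, `await conn.query(createT1Sql(false))` would then create the table without the constraint, which is the variant reported to work.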

Browser/Environment:

Node.js v20.11.1

Device:

MacBook M3 Pro

DuckDB-Wasm Version:

1.28.1-dev173.0

DuckDB-Wasm Deployment:

from npm

Full Name:

Mark Halonen

Affiliation:

gosteelhead.com