duckdb / duckdb_spatial

MIT License
486 stars 39 forks source link

DuckDB Crashes when trying to use r-tree spatial index #405

Closed DrewScatterday closed 1 month ago

DrewScatterday commented 1 month ago

Hi! Thanks for your hardwork on DuckDB absolutely loving it in my day to day spatial data work.

I think this may be be related to #391 but not sure. I think I found a bug with r-tree indexing. I'm using example code from this article https://duckdb.org/docs/extensions/spatial/r-tree_indexes#example

I'm using node.js 20.11.1, Windows 10, and duckdb 1.1.0

If I run this code, it dies with no error thrown:

const duckdb = require("duckdb");

function queryWithCallback(conn, sql) {
    return new Promise((resolve, reject) => {
        conn.all(sql, (err, result) => {
            if (err) {
                reject(err);
            } else {
                resolve(result);
            }
        });
    });
}

async function main() {
    const db = new duckdb.Database(":memory:");
    var sqlQuery = `INSTALL spatial;
                    LOAD spatial;
                    CREATE TABLE t1 AS SELECT point::GEOMETRY as geom
                    FROM st_generatepoints({min_x: 0, min_y: 0, max_x: 100, max_y: 100}::BOX_2D, 10_000, 1337);
                    CREATE INDEX my_idx ON t1 USING RTREE (geom);
                    SELECT count(*) FROM t1 WHERE ST_Within(geom, ST_MakeEnvelope(45, 45, 65, 65));`
    const rows = await queryWithCallback(db, sqlQuery);
    console.log(rows);
    console.log("πŸ¦† DuckDB initialized πŸ¦†");
}

main();

If I remove the r-tree index, it works with no problem:

const duckdb = require("duckdb");

function queryWithCallback(conn, sql) {
    return new Promise((resolve, reject) => {
        conn.all(sql, (err, result) => {
            if (err) {
                reject(err);
            } else {
                resolve(result);
            }
        });
    });
}

async function main() {
    const db = new duckdb.Database(":memory:");
    var sqlQuery = `INSTALL spatial;
                    LOAD spatial;
                    CREATE TABLE t1 AS SELECT point::GEOMETRY as geom
                    FROM st_generatepoints({min_x: 0, min_y: 0, max_x: 100, max_y: 100}::BOX_2D, 10_000, 1337);
                    SELECT count(*) FROM t1 WHERE ST_Within(geom, ST_MakeEnvelope(45, 45, 65, 65));`
    const rows = await queryWithCallback(db, sqlQuery);
    console.log(rows);
    console.log("πŸ¦† DuckDB initialized πŸ¦†");
}

main();

Output:

[ { 'count_star()': 390n } ]
πŸ¦† DuckDB initialized πŸ¦†

I also tried using nightly build:

const duckdb = require("duckdb");

function queryWithCallback(conn, sql) {
    return new Promise((resolve, reject) => {
        conn.all(sql, (err, result) => {
            if (err) {
                reject(err);
            } else {
                resolve(result);
            }
        });
    });
}

async function main() {
    const db = new duckdb.Database(":memory:");
    var sqlQuery = `FORCE INSTALL spatial FROM core_nightly;
                    LOAD spatial;
                    CREATE TABLE t1 AS SELECT point::GEOMETRY as geom
                    FROM st_generatepoints({min_x: 0, min_y: 0, max_x: 100, max_y: 100}::BOX_2D, 10_000, 1337);
                    CREATE INDEX my_idx ON t1 USING RTREE (geom);
                    SELECT count(*) FROM t1 WHERE ST_Within(geom, ST_MakeEnvelope(45, 45, 65, 65));`
    const rows = await queryWithCallback(db, sqlQuery);
    console.log(rows);
    console.log("πŸ¦† DuckDB initialized πŸ¦†");
}

main();

I get this as the error message:

node:internal/process/promises:289
            triggerUncaughtException(err, true /* fromPromise */);
            ^

�☻] {r: Invalid Error: P☺�
  errno: -1,
  code: 'DUCKDB_NODEJS_ERROR',
  errorType: 'Invalid'
}

Node.js v20.11.1

Any ideas, am I missing something?

Maxxen commented 1 month ago

Hi! Thanks for reporting this issue!

My first hunch is that this is related to https://github.com/duckdb/duckdb/issues/13848#issuecomment-2352674000, but the fact that it only crashes when you instantiate the r-tree index is strange... We run the r-tree tests as part of our CI on windows, so maybe it's node related? I'll try to have a look tomorrow.

DrewScatterday commented 1 month ago

Thanks @Maxxen !! Let me know what you find out

I can test on a Ubuntu WSL node js env just to help sanity check

DrewScatterday commented 1 month ago

Code works with no issues on Ubuntu 22.04.02 and same version of node (20.11.1)

So definitely a windows issue and not a rtree bug hmmm

DrewScatterday commented 1 month ago

I tried installing the nightly version of duckdb on winodws with npm install duckdb@next but the cmd window seems to just hang:

[##################] - reify:duckdb: timing reify:audit Completed in 276ms

edit: uninstalled duckdb and then installed with npm install duckdb@next and then still got the same issue

DrewScatterday commented 1 month ago

The duckdb version that gets installed with npm install duckdb@next is "duckdb": "^1.0.1-dev27.0". Was still getting this issue on that version.

I followed the thread in the issue you linked and did the following:

I then still got same issue

I also tried reinstalling spatial extension with FORCE INSTALL spatial FROM core_nightly; but I still get that weird error message I got above

carlopi commented 1 month ago

@DrewScatterday: duckdb-node has been updated, now the DuckDB library is at version v1.1.1. This brings in quite a few changes both DuckDB and spatial extension side, so it's possible that the situation regarding this bug has changed.

Could you possibly try to install https://www.npmjs.com/package/duckdb/v/1.1.1-dev3.0, and re-run the previous test?

Thanks a lot!

DrewScatterday commented 1 month ago

Hey @carlopi thanks for your hard work on duckdb!!

Unfortunately I still get the same issue, let me know if you are able to reproduce. I'm wondering if its something fishy with the spatial extension binary for Windows?

carlopi commented 1 month ago

Thanks for the info, we will dig a bit deeper and get to the bottom of this.

DrewScatterday commented 1 month ago

Any updates on this? are you guys able to reproduce this? @Maxxen @carlopi thank you for your hard work!!

Maxxen commented 1 month ago

Hello! I've been able to reproduce it, but yeah it only crashes on the specific combination of windows+node. windows+python seems fine. I've spent about a day looking into this but debugging node on windows is extremely time consuming as almost any change requires a full-recompilation and the windows machines we have access to are either under-powered or impose a completely different DX to what we're used to. This is of course something we want to fix so I am going to investigate this further but I can't provide an exact timeline.

DrewScatterday commented 1 month ago

Totally understand Windows is a giant pain haha

Keep me posted and if there is anything I can help test let me know

Maxxen commented 1 month ago

Debugging this almost killed me, but I learned a lot about windows and I think I got a fix for this in https://github.com/duckdb/duckdb-node/pull/127 :). Should hopefully make in for DuckDB v1.1.2

DrewScatterday commented 1 month ago

Hahaha thanks @Maxxen you're a legend!!

If it were up to me I wouldn't touch windows with a 10 ft pole but unfortunately the project I'm working on has other dependencies on Windows. Thanks again for looking into this, can I test it on the nightly build?

carlopi commented 1 month ago

@DrewScatterday: a pre-release version of duckdb-node is being built, it should be up in a couple of hours.

Can you possibly then double check whether this works? This will then next week be released as 1.1.2.

Thanks everyone, and amazing bug hunt from @Maxxen

DrewScatterday commented 1 month ago

Yes I'll test as soon as I can on nightly thanks guys!

carlopi commented 1 month ago

Duckdb-node nightly version is here: https://www.npmjs.com/package/duckdb/v/1.1.2-dev4.0 (on duckdb v1.1.1)

DrewScatterday commented 1 month ago

on 1.1.2-dev4.0 the issue is fixed on my side thanks again for your hardwork! πŸš€ πŸ₯³