duckdb / duckdb-wasm

WebAssembly version of DuckDB
https://shell.duckdb.org
MIT License

registerFileHandle and spatial extension? #1793

Closed by nshiab 1 month ago

nshiab commented 3 months ago

Hello!

You have great examples showing how to load data from a File handle here: https://duckdb.org/docs/api/wasm/data_ingestion

But this doesn't seem to work with the spatial extension. Here's what I tried to do:

// file is picked by the user with the File API
await db.registerFileHandle(
  file.name,
  file,
  DuckDBDataProtocol.BROWSER_FILEREADER,
  true
);

await c.query(
  `INSTALL spatial; LOAD spatial;
  CREATE OR REPLACE TABLE table1 AS SELECT * FROM ST_Read('${file.name}')`
);

But I get this error:

IO Error: Unknown file type

Thank you! 🙏

Originally posted by @nshiab in https://github.com/duckdb/duckdb-wasm/discussions/1791
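A side note on the snippet above: the file name is interpolated directly into the SQL text, so a name containing a quote would break the statement. Below is a minimal sketch of building the same statement with the name passed as a single-quoted string literal. `quoteSqlString`, `quoteSqlIdentifier`, and `buildStReadQuery` are hypothetical helpers, not part of duckdb-wasm, and this does not by itself address the `IO Error` reported here.

```typescript
// Hypothetical helpers (not part of duckdb-wasm): build the ST_Read statement
// with the registered file name as a single-quoted SQL string literal.

/** Quote a value as a SQL string literal, doubling embedded single quotes. */
function quoteSqlString(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

/** Quote an identifier such as a table name, doubling embedded double quotes. */
function quoteSqlIdentifier(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

/** Build the CREATE TABLE ... ST_Read(...) statement used in this thread. */
function buildStReadQuery(tableName: string, registeredName: string): string {
  return (
    `CREATE OR REPLACE TABLE ${quoteSqlIdentifier(tableName)} AS ` +
    `SELECT * FROM ST_Read(${quoteSqlString(registeredName)});`
  );
}

console.log(buildStReadQuery("table1", "bob's data.geojson"));
// prints: CREATE OR REPLACE TABLE "table1" AS SELECT * FROM ST_Read('bob''s data.geojson');
```

The resulting string could then be passed to `c.query(...)` after `INSTALL spatial; LOAD spatial;` has run.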

yfyf commented 3 months ago

I have the same issue with registerFileText, see here: https://github.com/duckdb/duckdb-wasm/discussions/1791#discussioncomment-10082002

malveo commented 2 months ago

Hello everyone,

I want to confirm that the issue described is reproducible and occurs 100% of the time under the specified conditions.

You can find a detailed example to validate this issue here:

https://github.com/malveo/duckdb-wasm-gdal-xlsx
https://github.com/malveo/duckdb-wasm-gdal-xlsx/blob/main/src/index.ts

When using the st_read function, the issue arises when the file is passed locally. It does not occur when the file is read via HTTP.

https://github.com/malveo/duckdb-wasm-gdal-xlsx/blob/main/public/vanilla/index.html

Thank you! 🙏

Best malveo

AdeelK93 commented 2 months ago

@malveo your vanilla example inspired me to come up with this workaround:

const blobUrl = URL.createObjectURL(file)
const fakeUrl = window.location.href + "data.xlsx"
await db.registerFileURL(fakeUrl, blobUrl, DuckDBDataProtocol.HTTP, true)
const read_xlsx = await c.query(
    `CREATE OR REPLACE TABLE table1 AS SELECT * FROM ST_Read('${fakeUrl}');`
);
URL.revokeObjectURL(blobUrl)
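The same workaround can be sketched as a reusable helper that always revokes the blob URL, even if the query throws. This is only an illustration under stated assumptions: `withBlobBackedUrl` and the `FileUrlRegistrar` interface are hypothetical names, the interface is a structural stand-in for the `registerFileURL` call used above, and the protocol value is passed in by the caller (for example `DuckDBDataProtocol.HTTP`).

```typescript
// Structural stand-in for the one database method used here, so the sketch
// type-checks without importing @duckdb/duckdb-wasm (an assumption of this sketch).
interface FileUrlRegistrar {
  registerFileURL(name: string, url: string, proto: number, directIO: boolean): Promise<void>;
}

/**
 * Hypothetical helper: expose a local File/Blob under a fake HTTP-style URL,
 * run a callback that queries that URL, then revoke the blob URL.
 */
async function withBlobBackedUrl<T>(
  db: FileUrlRegistrar,
  file: Blob,
  fakeUrl: string,
  httpProtocol: number, // caller passes DuckDBDataProtocol.HTTP
  run: (url: string) => Promise<T>,
): Promise<T> {
  const blobUrl = URL.createObjectURL(file);
  try {
    // Same call as in the workaround above: map the fake URL to the blob URL.
    await db.registerFileURL(fakeUrl, blobUrl, httpProtocol, true);
    return await run(fakeUrl);
  } finally {
    // Runs on success and on failure, so the blob URL is not leaked.
    URL.revokeObjectURL(blobUrl);
  }
}
```

With the `db`, `c`, and `file` objects from the comment above, a call might pass `window.location.href + "data.xlsx"` as `fakeUrl` and a callback that runs the `ST_Read` query through `c.query`.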

nshiab commented 2 months ago

Thank you @AdeelK93, but it doesn't seem to work in my context. Does it work for you @malveo ?

AdeelK93 commented 2 months ago

@nshiab what's the error you're getting? And are you on the latest duckdb-wasm (1.28.1-dev258.0)?

the registerFileURL approach is working reliably for me

ericemc3 commented 2 months ago

as of 1.28.1-dev258.0, it seems to work fine with registerFileHandle()

AdeelK93 commented 2 months ago

@ericemc3 I'm able to reproduce the bug in 1.28.1-dev258.0 with registerFileHandle, are you doing something different from malveo? Or did you mean registerFileURL?

ericemc3 commented 2 months ago

https://observablehq.com/d/4cc96a19f57a830d

nshiab commented 1 month ago

Works like a charm with @1.28.1-dev258.0. Thanks, @ericemc3! I was still using 1.28.1-dev106.0, since it's tagged as the latest.
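Since the `latest` tag pointed at an older dev build, a small guard on the version string can catch this early. Here is a rough sketch, assuming version strings shaped like `1.28.1-dev258.0`; `parseVersion` and `isAtLeast` are hypothetical helpers, and how the installed version string is obtained (for example from `package.json`) is left to the caller.

```typescript
/**
 * Hypothetical helper: parse "MAJOR.MINOR.PATCH" or "MAJOR.MINOR.PATCH-devN.M"
 * into a list of numbers. A version without a -dev suffix is treated as newer
 * than any dev build of the same MAJOR.MINOR.PATCH.
 */
function parseVersion(v: string): number[] {
  const m = /^(\d+)\.(\d+)\.(\d+)(?:-dev(\d+)\.(\d+))?$/.exec(v);
  if (!m) throw new Error(`Unrecognized version: ${v}`);
  const core = [Number(m[1]), Number(m[2]), Number(m[3])];
  const dev = m[4] === undefined ? [Infinity, Infinity] : [Number(m[4]), Number(m[5])];
  return [...core, ...dev];
}

/** True when `version` is the same as or newer than `minimum`. */
function isAtLeast(version: string, minimum: string): boolean {
  const a = parseVersion(version);
  const b = parseVersion(minimum);
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return a[i] > b[i];
  }
  return true; // all components equal
}

console.log(isAtLeast("1.28.1-dev106.0", "1.28.1-dev258.0")); // prints: false
console.log(isAtLeast("1.28.1-dev258.0", "1.28.1-dev258.0")); // prints: true
```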

malveo commented 1 month ago

Hi @nshiab and @ericemc3,

Thanks for the support!

I just tested with release 1.28.1-dev258.0 and followed @ericemc3's suggestion from https://observablehq.com/d/4cc96a19f57a830d, using registerFileURL with an XLSX file, and I encountered the following error:

Error: IO Error: GDAL Error (4): `file.xlsx` not recognized as a supported file format.
    at O.onMessage (@duckdb_duckdb-wasm.js?v=aa089efd:12032:15)

check it out here:

https://github.com/malveo/duckdb-wasm-gdal-xlsx

Am I missing something?

nshiab commented 1 month ago

Hi @malveo! I'm sorry. I don't know why, but it's probably worth opening a new issue. Good luck!