cornhundred closed this 2 months ago
when I run await pq.default(), I see this error
widget.js:237 TypeError: Failed to construct 'URL': Invalid URL at UP (542aeba4-c71d-47a1-8e67-2dcff8e6ca23:1553:14046) at Object.oz [as render] (542aeba4-c71d-47a1-8e67-2dcff8e6ca23:1560:1458) at async widget.js:363:17
If you look at the generated bindings, you can see
if (typeof input === 'undefined') {
    input = new URL('parquet_wasm_bg.wasm', import.meta.url);
}
in the __wbg_init function exported at the very end of the file. Presumably, your import.meta.url is not set correctly, so the new URL constructor fails.
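To make the failure mode concrete, here is a minimal sketch of how the URL constructor behaves with different base values. The helper name resolveWasmUrl is hypothetical, and the blob: base is only an assumption about what import.meta.url might look like inside a bundled widget (the localhost port is made up; the UUID is borrowed from the stack trace above).

```javascript
// Hypothetical helper: try to resolve a relative WASM path against a base URL.
// Returns the resolved URL, or null when the URL constructor rejects the input.
function resolveWasmUrl(relativePath, base) {
  try {
    return new URL(relativePath, base);
  } catch (err) {
    // The constructor throws a TypeError when the base is missing or
    // cannot serve as a base for relative paths.
    return null;
  }
}

const candidateBases = [
  // A normal module URL, as when loading from a CDN.
  "https://unpkg.com/parquet-wasm@0.4.0-beta.5/esm/arrow2.js",
  // import.meta.url not available at all.
  undefined,
  // ASSUMPTION: a blob: URL such as a bundled widget module might report.
  "blob:http://localhost:8888/542aeba4-c71d-47a1-8e67-2dcff8e6ca23",
];

for (const base of candidateBases) {
  const resolved = resolveWasmUrl("parquet_wasm_bg.wasm", base);
  console.log(String(base), "->", resolved ? resolved.href : "Invalid URL");
}
```

Only the first base resolves. A relative path cannot be resolved against a missing base or a blob: URL, which would be consistent with the "Failed to construct 'URL'" error, although it does not prove that this is what happens in Jupyter.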
and clicking LC (8a312991-81c6-426e-b14c-067bcbe5f62b:1553:10554) shows
async function LC(j, A) { if (typeof j == "string") { j.startsWith("./") && (j = new URL(j,import.meta.url).href); let t = await fetch(j); if (typeof WebAssembly.instantiateStreaming == "function") try { return await WebAssembly.instantiateStreaming(t, A) } catch (e) { if (t.headers.get("Content-Type") != "application/wasm") console.warn(e); else throw e } j = await t.arrayBuffer() } return await WebAssembly.instantiate(j, A) }
I can't find this function in the generated bindings in the latest bundler build. You should try one of the latest betas.
Also, the await pq.default() function works properly if I use a CDN to obtain parquet-wasm like this
import * as pq from "https://unpkg.com/parquet-wasm@0.4.0-beta.5/esm/arrow2.js";
So presumably import.meta.url isn't defined in Jupyter or something like that.
Thanks @kylebarron, I was able to get it to work in the following way and would appreciate any advice:
I am using the 0.4.0-beta.5 version of parquet-wasm because I haven't migrated to the new API yet, so the dependencies in my package.json look like this:
"dependencies": {
    "deck.gl": "^9.0.5",
    "parquet-wasm": "0.4.0-beta.5",
    "apache-arrow": "15.0.2",
    "math.gl": "2.3.3",
    "@loaders.gl/core": "4.1.1"
},
Since parquet-wasm was working correctly with the file obtained from unpkg, I figured I would download that file (https://unpkg.com/parquet-wasm@0.4.0-beta.5/esm/arrow2.js), save it locally to /vendor/parquet-wasm/parquet-wasm_unpkg.js (along with the project licenses), and import it like this:
import * as pq from "./vendor/parquet-wasm/parquet-wasm_unpkg.js";
...
I was still getting the URL error, so I added a console log to the parquet-wasm_unpkg.js file to print import.meta.url, which turns out to be the localhost address that is hosting Jupyter. On my MacBook I was able to change the URL to 'files/js/vendor/parquet-wasm/arrow2_bg.wasm', and it was able to load the file and run without error - see below:
async function init(input) {
    // console.log('here in the parquet-wasm source code');
    // Use a fixed path for development. You may need to adjust this path based on your project's structure and where it's served from.
    // For example, if your server serves the `vendor` directory at the root, and `arrow2_bg.wasm` is within `vendor/parquet-wasm/`,
    // the path should reflect that.
    const fixedPath = 'files/js/vendor/parquet-wasm/arrow2_bg.wasm'; // Adjust this path as necessary.
    // js/vendor/parquet-wasm
    if (typeof input === 'undefined') {
        // Assume we're in a browser environment and construct the URL relative to the server's root.
        input = new URL(fixedPath, window.location.origin);
    }
    // console.log('WASM module will be loaded from:', input);
    const imports = getImports();
    if (typeof input === 'string' || (typeof Request === 'function' && input instanceof Request) || (typeof URL === 'function' && input instanceof URL)) {
        input = fetch(input);
    }
    initMemory(imports);
    const { instance, module } = await load(await input, imports);
    return finalizeInit(instance, module);
}
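Since the generated init above only builds its own URL when input is undefined, the same effect can probably be had without editing the vendored file, by passing the location in from the calling code. A minimal sketch, assuming the same served path as above; wasmUrlFor is a hypothetical helper name and the localhost origin is a placeholder:

```javascript
// Hypothetical helper: build an explicit WASM URL from the page origin so that
// init() never has to fall back to import.meta.url.
function wasmUrlFor(
  pageOrigin,
  // ASSUMPTION: the path under which the server exposes the vendored .wasm file.
  wasmPath = "files/js/vendor/parquet-wasm/arrow2_bg.wasm"
) {
  return new URL(wasmPath, pageOrigin);
}

// Placeholder origin; in a browser this would be window.location.origin.
console.log(wasmUrlFor("http://localhost:8888").href);
```

In the widget this would be used as await pq.default(wasmUrlFor(window.location.origin)). It still depends on the notebook server actually serving that path, so it shares the limitation of the hard-coded version.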
However, this did not work on Google Colab and Terra.bio - probably because we can't rely on Jupyter hosting files there. So I figured I would try to hardwire the WASM file into the JavaScript by converting it to a Base64 string. I saved this string to a file called wasmModuleBase64.js that looks like this:
export const wasmBase64 = `AGFzbQEAAAAB5 ...
and imported it into my local copy of parquet-wasm_unpkg.js, where the init function decodes it:
import { wasmBase64 } from './wasmModuleBase64.js';
async function init(input) {
    // No need to adjust the path, as we'll be loading the WASM from a Base64 string
    const imports = getImports();
    // Decode the Base64 string to get the binary representation
    const binaryString = window.atob(wasmBase64);
    const bytes = new Uint8Array(binaryString.length);
    for (let i = 0; i < binaryString.length; i++) {
        bytes[i] = binaryString.charCodeAt(i);
    }
    initMemory(imports);
    // Use the binary bytes to instantiate the WebAssembly module
    const { instance, module } = await WebAssembly.instantiate(bytes, imports);
    return finalizeInit(instance, module);
}
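The decode-and-instantiate step can be tried in isolation. The sketch below is not the parquet-wasm code: it uses the smallest valid WebAssembly module (just the magic number and version) as a stand-in for the real Base64 string, and the helper names base64ToBytes and instantiateFromBase64 are hypothetical.

```javascript
// Smallest valid WebAssembly module (magic number plus version), Base64-encoded.
// ASSUMPTION: stands in for the real, much longer parquet-wasm string.
const emptyModuleBase64 = "AGFzbQEAAAA=";

// Decode a Base64 string into the raw bytes of the module.
function base64ToBytes(base64) {
  const binaryString = atob(base64);
  return Uint8Array.from(binaryString, (ch) => ch.charCodeAt(0));
}

// Validate the bytes first so a corrupted string fails with a clear message,
// then instantiate. Resolves to { module, instance } when given raw bytes.
async function instantiateFromBase64(base64, imports = {}) {
  const bytes = base64ToBytes(base64);
  if (!WebAssembly.validate(bytes)) {
    throw new Error("Decoded bytes are not a valid WebAssembly module");
  }
  return WebAssembly.instantiate(bytes, imports);
}

instantiateFromBase64(emptyModuleBase64).then(({ instance }) => {
  console.log("exports:", Object.keys(instance.exports));
});
```

One trade-off to keep in mind: Base64 makes the embedded payload roughly a third larger than the raw .wasm file, and instantiating from bytes gives up WebAssembly.instantiateStreaming.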
This approach seems to be working locally and on Google Colab and Terra.bio. Do you think this is a reasonable approach? If so, would it make sense to include the WASM code as a base64 string in the esm version?
Hi, I am seeing errors when I try to import parquet-wasm using the bundler esbuild.
Similar to this issue, https://github.com/kylebarron/parquet-wasm/issues/486, I am seeing this error
when I import parquet-wasm like this
without awaiting the default function. However, when I run await pq.default(), I see this error
If I try to switch to using the bundler build like this
(which required using the wasmLoader for esbuild and setting the target to esnext to enable top-level await) I get this error
and clicking LC (8a312991-81c6-426e-b14c-067bcbe5f62b:1553:10554) shows
For some background, I'm using parquet-wasm in an anywidget that is being bundled with esbuild on the suggestion from this discussion. Also, the await pq.default() function works properly if I use a CDN to obtain parquet-wasm like this
import * as pq from "https://unpkg.com/parquet-wasm@0.4.0-beta.5/esm/arrow2.js";