adamviola / parquet-explorer

Explore Parquet files with SQL
MIT License

Issue with LargeUtf8 data type #2

Closed: snestler closed this issue 2 months ago

snestler commented 7 months ago

I can view some .parquet files in VS Code using parquet-explorer, but it fails on others. Here is the error message I am getting:

{"error":"while reading {path-to-file\filename.parquet}: Error: Unrecognized type: \"LargeUtf8\" (20)"}

Any suggestions? Thank you.

adamviola commented 2 months ago

We use DuckDB to read the Parquet file, and I believe DuckDB in turn uses Apache Arrow to do the actual reading.

Apache Arrow only recently added JavaScript support for reading LargeUtf8! It looks like this happened in v15.0 (based on comparing the implementation status pages for v14.0 and v15.0).

However, it looks like DuckDB is using an old version of Arrow, v9.0.0.

Closing for now. I'll keep an eye on the version of Arrow used by future DuckDB versions.
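
For anyone triaging a similar report, here is a minimal sketch of the reasoning in this thread. It pulls the unrecognized type name and numeric id out of an error payload shaped like the one above, then checks whether a given Arrow version is at or past the v15.0 threshold mentioned in the comment. The function names, the sample path, and the threshold constant are hypothetical and only for illustration; none of this is part of parquet-explorer, DuckDB, or Arrow.

```python
import json
import re

# Assumption taken from this thread, not verified here: Arrow's JavaScript
# implementation can read LargeUtf8 starting with v15.0.
LARGE_UTF8_MIN_MAJOR = 15

# Sample payload in the same shape as the reported error (the path is made up).
SAMPLE_ERROR = (
    r'{"error":"while reading data/example.parquet: '
    r'Error: Unrecognized type: \"LargeUtf8\" (20)"}'
)


def parse_unrecognized_type(raw):
    """Return (type_name, type_id) from an error payload, or None if absent."""
    message = json.loads(raw)["error"]
    match = re.search(r'Unrecognized type: "(\w+)" \((\d+)\)', message)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def arrow_reads_large_utf8(version):
    """Check a version string such as 'v9.0.0' or '15.0' against the threshold."""
    major = int(version.lstrip("v").split(".")[0])
    return major >= LARGE_UTF8_MIN_MAJOR
```

Calling `parse_unrecognized_type(SAMPLE_ERROR)` yields the type name and id from the message, and `arrow_reads_large_utf8("v9.0.0")` is false under the stated assumption, which matches the diagnosis above that the bundled Arrow version is too old.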