-
**Describe the bug**
When the schema implies a nested structure, .parquet files generated by arrow-rs should name the list item `element`, per the Parquet specification:
https://github.com/apache/parquet…
-
**Parquet Viewer Version**
3.1.0, also tried/used 2.8.
**Where was the parquet file created?**
Apache Spark
org.apache.spark.timeZone GMT org.apache.spark.legacyINT96 org.apache.spark.version 3.…
-
### Apache Iceberg version
1.4.3
### Query engine
Spark
### Please describe the bug 🐞
```
CALL spark_catalog.system.rewrite_data_files(
table => '${DATABASE_NAME}.${TABLE_NAME}'…
-
When I read a Parquet dataset saved with Spark on a Databricks catalog, I get lots of .
I tried
```
import glob

import cudf
cudf_dfs = [cudf.read_parquet(file) for file in glob.glob("/Volumes/path/*.parquet…
-
I'm encountering the following error while trying to load some parquet files (using docker latest):
```log
Initiating shutdown due to: Uncaught exception in thread main
java.lang.RuntimeException…
-
### Describe the bug
I'm trying to load an SQLite database that's around 100MB.
Seems like I'm hitting this line when trying to access a table in the db that's bigger than 32MB:
https://github.com/e…
-
Currently, all of these tests utilize randomness to generate page data for decoder verification. This can introduce test flakiness.
We should rewrite these tests to use a deterministic set of valu…
-
## Issue Description
- Description of the issue:
When exporting an MSSQL table to Parquet, I get a file for which DuckDB complains about string encoding issues.
"select * from output.parquet…
-
### What happens?
When using `COPY ... TO ...` to generate a new Parquet file from another Parquet file, the `ROW_GROUP_SIZE` parameter doesn't work.
The final row group size is very low (under 1…
-
### What happens?
Recently, we tried this extension instead of using a standalone DuckDB instance. When we run a simple `SELECT` query on Parquet files, it's 2-20 times slower than DuckDB.
Profil…