-
### What happened + What you expected to happen
If we use `ray.data.read_parquet` to read a column whose values are lists, and then iterate over it in batches, it will return batch-size-many NumPy …
-
Hi @enrico-mi
I really like the chunked approach used in your implementation; it really makes this dataset scalable.
In contrast to the EasyArgo single-Parquet-file approach, you keep variables in c…
gmaze updated
3 weeks ago
-
Polars reads the Parquet-encoded dataset in 800 ms, whereas Vortex takes 4 s.
### Reproduction
Assuming you have the PBI parquet dataset downloaded, write a Vortex file:
```python
import vortex
…
-
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/incubator-xtable/issues?q=is%3Aissue) and found no similar issues.
### Please describe the bug 🐞
Team, I hav…
-
How can I write a Parquet file with the page index enabled, just like pyarrow does?
-
### Terraform CLI Version
1.9.8
### Terraform Provider Version
0.96.0
### Company Name
PGGM
### Terraform Configuration
```terraform
resource "snowflake_file_format" "parquet" {
name = "…
-
When trying to write a Parquet file with `:decimal` types, I receive the following error:
```
; Execution error at tech.v3.libs.parquet/column->field (parquet.clj:886).
; Unsupported datatype for p…
-
### Describe the bug, including details regarding any error messages, version, and platform.
#### Setup
I am using `pyarrow` version `18.0.0`.
I am running my tests on an AWS `r6g.large` instan…
-
### Checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://pypi.org/project/polars/) of Polars.
### Reprodu…
-
**Describe the bug**
Creating a record batch with Arrow map types produces different field names than the Parquet spec wants. When you write a Parquet file with DataFusion, the Parquet spec is simply ignored …