-
GATK forum users have started requesting support for macOS and Linux arm64 architectures - see [forum post](https://gatk.broadinstitute.org/hc/en-us/community/posts/5462468688539-Is-ARM64-Linux-MacOS-…
-
### System Info
0.0.234 MacOS Big Sur
### Who can help?
_No response_
### Information
- [ ] The official example notebooks/scripts
- [ ] My own modified scripts
### Related Components
- [ ] LLM…
-
# Parquet Format
*Provide the standard metadata for the proposed format, ensuring that the id and name are unique and appropriate to the version of the format being proposed.*
- formatId: `appli…
-
### Motivation
Parquet files work with partitioning, but you need a deliberate approach to it. As we work towards building an incremental pipeline, we need an approach to append records rather than what we h…
-
- Build a storage prototype that repurposes the LSM/DRAM spine implementation and stores columnar data in a custom rkyv-friendly format that allows zero-copy deserialization
- Evaluate it by running…
-
I have around 40K columns in a Spark DataFrame, and many of them contain null values. Writing the data to an Iceberg table takes a long time, even though the final Parquet data is only 20-25 MB. Is the sparse …
-
We use pandas-gbq a lot for our daily analyses. It is known that memory consumption can be a pain, see e.g. https://www.dataquest.io/blog/pandas-big-data/
I have started to write a patch, which cou…
-
An `Array` is a dictionary of attributes and an ndarray. It is written to disk as an HDF5 `dataset` and attributes. A `Table` is a collection of `Array` where each `Array` is written separately to dis…
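
The layout described here can be sketched with `h5py`. This is only an illustration under assumptions, not the project's actual implementation: the helper names `write_array`, `read_array`, and `roundtrip_demo`, the column names, and the choice of one top-level dataset per `Array` are all hypothetical.

```python
import os
import tempfile

import h5py
import numpy as np


def write_array(h5file, name, data, attrs):
    """Write one Array: an HDF5 dataset plus its attributes (hypothetical helper)."""
    dset = h5file.create_dataset(name, data=data)
    for key, value in attrs.items():
        dset.attrs[key] = value
    return dset


def read_array(h5file, name):
    """Read one Array back as (attributes dict, ndarray) (hypothetical helper)."""
    dset = h5file[name]
    return dict(dset.attrs), dset[...]


def roundtrip_demo(path):
    # Assumed Table shape: a mapping from column name to (ndarray, attributes).
    table = {
        "time": (np.arange(5, dtype=np.float64), {"unit": "s"}),
        "count": (np.array([3, 1, 4, 1, 5], dtype=np.int64), {"unit": "events"}),
    }
    with h5py.File(path, "w") as f:
        for name, (data, attrs) in table.items():
            # Each Array is written as its own dataset.
            write_array(f, name, data, attrs)
    with h5py.File(path, "r") as f:
        return {name: read_array(f, name) for name in f.keys()}


with tempfile.TemporaryDirectory() as tmp:
    result = roundtrip_demo(os.path.join(tmp, "table.h5"))
```

Because each `Array` is its own dataset in this sketch, a single column can be read back without touching the others.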
-
**Reporter**: [Uwe Korn](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=uwe) / @xhochy
#### Related issues:
- [Some logical types not supported when loading Parquet](https://issues.apac…