-
We have an issue (https://github.com/NVIDIA/spark-rapids/issues/9058) to enable parquet writes in V2 format. We would also like to test the reader, and to test combinations of GPU/CPU encoding and decoding…
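A minimal sketch of the round-trip such a test implies, assuming plain PySpark: `parquet.writer.version` is the upstream parquet-mr Hadoop key that selects the V2 writer, and the output path here is hypothetical. In a GPU/CPU matrix, the plugin's `spark.rapids.sql.enabled` flag would presumably be toggled around the write and read halves.
```python
from pyspark.sql import SparkSession

# "spark.hadoop.*" settings are copied into the Hadoop Configuration,
# so this asks parquet-mr for the V2 writer (PARQUET_1_0 is the default).
spark = (
    SparkSession.builder
    .appName("parquet-v2-roundtrip")
    .config("spark.hadoop.parquet.writer.version", "PARQUET_2_0")
    .getOrCreate()
)

df = spark.range(1_000_000).selectExpr("id", "id % 7 AS bucket")
df.write.mode("overwrite").parquet("/tmp/parquet_v2_test")  # hypothetical path

# Read the V2-encoded files back and sanity-check the row count.
assert spark.read.parquet("/tmp/parquet_v2_test").count() == 1_000_000
```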
-
### Checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://pypi.org/project/polars/) of Polars.
### Reprodu…
-
To support loading large tables that might not fit into memory, it would be a good idea to add an option to `Table.get()` (or another method?) to read the data piece-wise.
Let us first create an …
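The example setup is cut off above, but as a rough sketch of the kind of piece-wise access being requested, this is how pyarrow streams a parquet file in bounded chunks today; the file and column names are made up.
```python
import pyarrow.parquet as pq

# Iterate over bounded record batches instead of loading the whole
# table; only one batch is resident in memory at a time.
pf = pq.ParquetFile("big_table.parquet")  # hypothetical file
total_rows = 0
for batch in pf.iter_batches(batch_size=64_000, columns=["id", "value"]):
    total_rows += batch.num_rows  # process each chunk here
print(total_rows)
```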
-
An R lesson showing how to use detection extracts stored in parquet files, and why parquet is a better format than CSV.
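The lesson itself would be written in R, but the usual argument translates directly; a small Python sketch, with a made-up detection extract, of the two standard points, type fidelity and column pruning:
```python
import pandas as pd

# Hypothetical detection extract: CSV forgets dtypes, parquet keeps them.
df = pd.DataFrame({
    "tag_id": pd.array([101, 102], dtype="int32"),
    "detected_at": pd.to_datetime(["2023-05-01T06:00", "2023-05-01T06:05"]),
})
df.to_csv("det.csv", index=False)
df.to_parquet("det.parquet")

print(pd.read_csv("det.csv").dtypes)          # detected_at comes back as object
print(pd.read_parquet("det.parquet").dtypes)  # datetime64[ns] preserved

# Parquet is columnar, so a single column can be read without the rest.
print(pd.read_parquet("det.parquet", columns=["tag_id"]))
```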
-
Tried filtering on a timestamp column using DuckDB against a Lance dataset, and it did not work. The same query worked against a parquet file.
```python
import pyarrow as pa
import lancedb
import du…
-
Hi,
I thought it might be a good idea to put the lazy Reference parquet files into git. Using this data directly from git does not seem to be possible - e.g. our GitLab server also does not allow byte-ran…
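For context, lazy parquet readers fetch only the footer and the column chunks a query touches, which requires HTTP Range support from the server. A quick probe, with a placeholder URL, to check whether a given server honors range requests:
```python
import requests

# Ask for just the first four bytes; a server that honors Range requests
# answers 206 Partial Content with the parquet magic bytes "PAR1".
url = "https://example.com/data/reference.parquet"  # hypothetical URL
resp = requests.get(url, headers={"Range": "bytes=0-3"})
if resp.status_code == 206 and resp.content == b"PAR1":
    print("byte ranges supported; lazy reads are possible")
else:
    print(f"no partial content (HTTP {resp.status_code}); "
          "readers would have to download the whole file")
```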
-
While trying to read a parquet file from the dataset revision refs/convert/parquet (generated by datasets-server) with duckdb, I get the following error:
```
D select * from 'https://huggingface.…
-
AWS has an option to export an RDS DB snapshot to parquet files in an S3 bucket, but the resulting files are gzipped.
Is it possible to load them directly with the fdw, or do I need to run a batch job …
-
Hello all.
I'm trying to export query results from a BigQuery table. Since the resulting table can be large (2.5 GB or more), I followed the "Larger datasets" suggestion from the `bq_table_do…
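(The quoted docs are for R's bigrquery; for what it's worth, the equivalent move in Python, sketched with hypothetical project and bucket names, is an extract job that writes sharded parquet to GCS rather than paging rows down through the API.)
```python
from google.cloud import bigquery

# Export a large table to sharded parquet files in GCS; the wildcard in
# the destination URI lets BigQuery split the output across many files.
client = bigquery.Client()
job_config = bigquery.ExtractJobConfig(
    destination_format=bigquery.DestinationFormat.PARQUET
)
extract = client.extract_table(
    "my-project.my_dataset.my_table",        # hypothetical table
    "gs://my-bucket/export/part-*.parquet",  # hypothetical bucket
    job_config=job_config,
)
extract.result()  # block until the export job finishes
```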
-
### Description
I'm currently using polars to perform ETL where the final destination is a data lake, and there's an incompatibility when working with LazyFrames that's causing significant perfo…
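The description is cut off above, so the exact incompatibility is unclear, but for the memory and performance side of LazyFrame ETL, a sketch of the streaming route polars offers (paths and column names invented):
```python
import polars as pl

# Keep the pipeline lazy end to end and stream the result to parquet
# instead of materializing the whole frame with .collect().
(
    pl.scan_parquet("landing/*.parquet")
    .filter(pl.col("amount") > 0)
    .with_columns((pl.col("amount") * 0.1).alias("fee"))
    .sink_parquet("curated/output.parquet")
)
```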