-
## Description
With DuckDB, one can read CSV, JSON, or Parquet files served by a web server:
Example:
```
SELECT *
FROM read_parquet('https://some.url/some_file.parquet');
```
See …
-
Iceberg tables are not compressing Parquet files in S3. When the table parameters below are used for compression, the file size increases compared with the uncompressed files. Can someone please assist…
-
Following the [documentation](https://huggingface.co/docs/datasets/repository_structure#define-your-splits-and-subsets-in-yaml), I defined different configs for [`Dataception`](https://huggingface.co/d…
-
**Feature request**
Currently, LSDB has the logic to perform the geometric filtering for cone, polygon, and box search. For the sake of code re-use, move the code to this package instead.
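As a rough illustration of what a cone filter does, here is a minimal sketch in plain Python. It is not LSDB's implementation: the function names `angular_separation_deg` and `cone_filter` are hypothetical, and it assumes positions are given as (right ascension, declination) pairs in degrees.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # The haversine form stays numerically stable for small separations.
    h = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cone_filter(points, center_ra, center_dec, radius_deg):
    """Keep the (ra, dec) points that lie within radius_deg of the cone center."""
    return [
        (ra, dec)
        for ra, dec in points
        if angular_separation_deg(ra, dec, center_ra, center_dec) <= radius_deg
    ]


# Hypothetical sample positions, chosen only for illustration.
points = [(10.0, 20.0), (10.5, 20.0), (200.0, -45.0)]
inside = cone_filter(points, center_ra=10.0, center_dec=20.0, radius_deg=1.0)
```

A polygon or box search would presumably follow the same shape: a pure geometric predicate applied to each position, which is what makes the logic easy to move into a shared package.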
**Befo…
-
At the moment, Modin falls back to pandas on Parquet files that have [partitioned columns](https://www.vertica.com/docs/10.0.x/HTML/Content/Authoring/ExternalTables/UsingPartitions.htm):
```python
imp…
```
-
### Description
Some parquet files may contain incorrectly calculated statistics (e.g. some of the ones written by older versions of polars containing UInt64 statistics had incorrect min/max). Beca…
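To show in general terms why wrong min/max statistics matter, here is a minimal sketch of statistics-based row-group pruning. It is a toy model, not the API of polars or any Parquet reader: `RowGroup` and `scan_equal` are hypothetical names, and the corrupt metadata is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    values: list   # actual column values stored in the row group
    stat_min: int  # minimum recorded in the file metadata
    stat_max: int  # maximum recorded in the file metadata


def scan_equal(row_groups, target, use_statistics=True):
    """Return the values equal to target, optionally pruning row groups by min/max."""
    hits = []
    for rg in row_groups:
        if use_statistics and not (rg.stat_min <= target <= rg.stat_max):
            # The reader trusts the metadata and never decodes this row group.
            continue
        hits.extend(v for v in rg.values if v == target)
    return hits


good = RowGroup(values=[1, 5, 9], stat_min=1, stat_max=9)
# Hypothetical corrupt metadata: the real maximum is 2**63 + 5, but 3 was recorded.
bad = RowGroup(values=[3, 2**63 + 5], stat_min=3, stat_max=3)

with_stats = scan_equal([good, bad], 2**63 + 5)
without_stats = scan_equal([good, bad], 2**63 + 5, use_statistics=False)
```

With pruning enabled the matching row is silently skipped, while the scan that ignores statistics finds it. That is why a reader may want a way to distrust or bypass statistics for files known to be affected.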
-
**Is your feature request related to a ~problem or~ challenge? Please describe what you are trying to do.**
We have been using at least two parquet writers that both utilize the low-level APIs prov…
-
In the (rare) case where multiple input files are uploaded to S3 with the same name, they will overwrite one another. This can occur when processing Parquet files, where each file ends up with a name …
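One possible way to avoid such collisions is to derive the object key from the full source path rather than from the file name alone. The sketch below is only an illustration under that assumption, not the project's actual fix: `unique_key`, the `uploads` prefix, and the sample paths are all hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def unique_key(source_path, prefix="uploads"):
    """Build an object key that stays distinct for same-named files from different directories."""
    path = PurePosixPath(source_path)
    # A short digest of the full path disambiguates files that share a base name.
    digest = hashlib.sha256(str(path).encode("utf-8")).hexdigest()[:12]
    return f"{prefix}/{path.stem}-{digest}{path.suffix}"


# Hypothetical inputs: two different files that share the same base name.
sources = ["/data/run_a/part-0.parquet", "/data/run_b/part-0.parquet"]

# Keys built from the base name alone collapse into a single object.
naive_keys = {f"uploads/{PurePosixPath(s).name}" for s in sources}
# Keys that include a path digest stay distinct.
safe_keys = {unique_key(s) for s in sources}
```

Because the digest is deterministic, re-uploading the same source file overwrites its own object rather than creating a duplicate.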
-
The idea is to implement these metadata capabilities from [duckdb](https://duckdb.org/docs/data/parquet/metadata).
I suggest we file a second ticket for implementing parquet_metadata and o…
-
**Describe the issue**:
`dd.read_parquet` crashes when reading one index level of a Parquet file that has a `multi-index`.
df
```
     a  b
x y
0 1  2  3
```
Error
```
File…
```