-
The `GraphicWalker` is quite powerful. Please add a CLI command for quickly exploring a `.csv` or `.parquet` file.
```bash
gw show my_data.parquet
```
This should then start a server on a port. The…
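A minimal sketch of what such a `gw show` subcommand could look like, using only the standard library (the names `load_table`, `make_handler`, and the JSON-serving handler are hypothetical stand-ins; a real implementation would need a parquet reader and would serve the GraphicWalker UI instead of raw JSON):

```python
# Hypothetical sketch of a `gw show FILE [--port N]` CLI. Only .csv loading
# is shown; .parquet support would need a parquet library (e.g. pyarrow).
import argparse
import csv
import json
import pathlib
from http.server import BaseHTTPRequestHandler, HTTPServer

def load_table(path):
    """Load a .csv file into a list of row dicts."""
    p = pathlib.Path(path)
    if p.suffix != ".csv":
        raise ValueError(f"unsupported extension: {p.suffix}")
    with p.open(newline="") as f:
        return list(csv.DictReader(f))

def make_handler(rows):
    """Build a request handler that serves the loaded rows as JSON."""
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = json.dumps(rows).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
    return Handler

def main(argv=None):
    parser = argparse.ArgumentParser(prog="gw")
    sub = parser.add_subparsers(dest="cmd", required=True)
    show = sub.add_parser("show", help="explore a data file in the browser")
    show.add_argument("file")
    show.add_argument("--port", type=int, default=8080)
    args = parser.parse_args(argv)
    rows = load_table(args.file)
    # Blocks and serves until interrupted.
    HTTPServer(("127.0.0.1", args.port), make_handler(rows)).serve_forever()
```

Invoking `main(["show", "my_data.csv", "--port", "8080"])` would then serve the table on localhost, matching the requested UX.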
-
I have started evaluating both your parquet-tools and xitongsys, with a view to potentially participating in the open-source project and helping with ongoing improvements. One issue I have found that does imp…
-
### What happens?
The result of a query is written to a parquet file. That parquet file is read, transformations are applied, and the result is written to another parquet file. The process runs with 8 thre…
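The round-trip described above can be sketched roughly as follows. This is a hypothetical reconstruction of the pipeline's shape only, since the original code is not shown: the parquet read/write steps are stubbed with in-memory batches, and `read_batches`, `transform`, and `run_pipeline` are invented names.

```python
# Hypothetical shape of the reported pipeline: read batches, transform them
# on 8 worker threads, collect results for the output file. Real code would
# read/write parquet row groups via a parquet library; the I/O is stubbed here.
from concurrent.futures import ThreadPoolExecutor

def read_batches():
    # Stand-in for reading row groups from the input parquet file.
    return [list(range(i, i + 4)) for i in range(0, 16, 4)]

def transform(batch):
    # Stand-in for the per-batch transformations.
    return [x * 2 for x in batch]

def run_pipeline():
    batches = read_batches()
    with ThreadPoolExecutor(max_workers=8) as pool:  # "runs with 8 threads"
        transformed = list(pool.map(transform, batches))
    # Stand-in for writing the result to another parquet file.
    return transformed
```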
-
Hi,
I just noticed that, in the Linux version under Docker, the xic folder wasn't generated with the --xic argument, and I found that the path was set to start with "/", so all xic parquet files get s…
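The reported symptom is consistent with a well-known path-joining pitfall: when a joined component begins with "/", it is treated as absolute and everything before it is discarded, so output lands at the filesystem root. A minimal Python illustration of the behavior (the actual tool may be written in another language, but most path APIs behave analogously; shown on POSIX):

```python
# When the second component is absolute, path joins drop the first component
# entirely -- both in os.path and in pathlib (POSIX behavior shown).
import os
from pathlib import Path

out_dir = "output"
xic = "/xic"  # path accidentally configured with a leading slash

print(os.path.join(out_dir, xic))  # -> /xic   (out_dir is discarded)
print(Path(out_dir) / xic)         # -> /xic

# Stripping the leading separator restores the intended relative layout:
fixed = os.path.join(out_dir, xic.lstrip("/"))
print(fixed)                       # -> output/xic
```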
-
**Summary**
Interval is a value type that Databend understands, as it is used in date addition. However, there is currently no way to store an Interval value, as can be done in Postgres.
While Sn…
-
## Description
- I would like to save partitioned Polars parquet datasets with write_parquet, which currently relies on PyArrow
- Following documentation: [https://docs.pola.rs/api/python/version/…
-
DuckDB's HTTPFS feature, which can read parquet, csv, json, and other files on HTTP servers or cloud object storage, is an incredibly powerful tool that allows the query engine to use range reads to p…
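The range-read idea can be illustrated with a stdlib-only sketch, independent of DuckDB itself: a tiny server that honors the HTTP `Range` header, and a client that fetches only the last few bytes of a resource, the way a parquet reader first fetches the footer. The handler and `fetch_range` helper are illustrative, not part of any real library:

```python
# Stdlib-only illustration of HTTP range reads: the client fetches just a
# byte slice instead of downloading the whole file, which is what lets an
# engine read only the parquet footer and the needed column chunks remotely.
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

DATA = b"0123456789" * 100  # stand-in for a remote parquet file

class RangeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        rng = self.headers.get("Range")  # e.g. "bytes=990-999"
        if rng and rng.startswith("bytes="):
            start, end = (int(x) for x in rng[len("bytes="):].split("-"))
            body = DATA[start:end + 1]
            self.send_response(206)  # Partial Content
            self.send_header("Content-Range", f"bytes {start}-{end}/{len(DATA)}")
        else:
            body = DATA
            self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

def fetch_range(url, start, end):
    """Fetch only bytes [start, end] of the resource."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

server = HTTPServer(("127.0.0.1", 0), RangeHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/data"

# "Footer" read: only the last 10 bytes cross the wire.
tail = fetch_range(url, len(DATA) - 10, len(DATA) - 1)
server.shutdown()
```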
-
### Checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://pypi.org/project/polars/) of Polars.
### Re…
-
### Describe the bug
I was working on https://github.com/apache/datafusion/pull/13054, and after rebasing on main I found that the current parquet predicate pushdown may have a problem: it is 7 times slow…
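For readers unfamiliar with the feature: parquet predicate pushdown prunes row groups using their min/max statistics before decoding any data. A toy sketch of the idea (this is an invented model of the general technique, not DataFusion's actual implementation):

```python
# Toy model of parquet row-group pruning via min/max statistics: a row group
# can be skipped when its stats prove the predicate cannot match any row in
# it. This is the core idea behind parquet predicate pushdown.
from dataclasses import dataclass

@dataclass
class RowGroupStats:
    min_val: int
    max_val: int

def prune(groups, lo, hi):
    """Keep indices of groups whose [min, max] overlaps lo <= x <= hi."""
    return [i for i, g in enumerate(groups)
            if g.max_val >= lo and g.min_val <= hi]

groups = [RowGroupStats(0, 99), RowGroupStats(100, 199), RowGroupStats(200, 299)]
print(prune(groups, 150, 250))  # -> [1, 2]  (group 0 is skipped entirely)
```

A regression like the reported one typically means the pruning (or the predicate evaluation feeding it) stops eliminating work it used to eliminate, so the scan decodes far more data.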
-
### dlt version
1.3.0
### Describe the problem
I have a pipeline that copies a table from SQL Server to Azure Gen2 storage. It creates delta files and works fine if the parquet files are small; howe…