-
### What happens?
Impala (CDH 7.1.9) sometimes fails to read Parquet files (containing null values) generated by DuckDB.
```bash
Parquet file '.../test.parquet': metadata is corrupt. Dicti…
```
-
## Describe the new feature
We need a way to determine whether a bloom filter is applied to a Parquet file when inspecting the Parquet metadata with ClickHouse via `SELECT * FROM file('output.par…
-
**Describe the issue**:
The way timedelta values (a.k.a. durations, intervals...) are stored in parquet does not follow the file format specification. According to the [parquet specification](h…
-
**Problem description**
Session logs are stored in Parquet files. They can be read on the nodes using `parquet-tools show state.parquet`. This command is hard to remember. We should consider adding:…
-
Some personal data columns need to be masked instead of being pruned (Parquet-1791). We need a tool to replace the raw data columns with masked values. The masked value could be a hash, null, a redaction, etc. …
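
The three masking modes named above (hash, null, redact) can be sketched on plain Python rows. This is only an illustration of the idea, not the tool the issue asks for: `mask_value`, `mask_columns`, the `"REDACTED"` placeholder, and the choice of SHA-256 are all assumptions made for this example, and a real tool would operate on Parquet column chunks rather than dictionaries.

```python
import hashlib


def mask_value(value, mode):
    """Return a masked stand-in for one raw value (hypothetical helper)."""
    if mode == "null":
        return None
    if mode == "redact":
        # Placeholder text is an assumption, not defined by the issue.
        return "REDACTED"
    if mode == "hash":
        # SHA-256 is an assumed choice; the issue only says "hash".
        return hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    raise ValueError(f"unknown mask mode: {mode!r}")


def mask_columns(rows, column_modes):
    """Copy rows (dicts), masking only the columns listed in column_modes."""
    return [
        {
            name: mask_value(val, column_modes[name]) if name in column_modes else val
            for name, val in row.items()
        }
        for row in rows
    ]


rows = [{"id": 1, "email": "a@example.com", "ssn": "123-45-6789"}]
masked = mask_columns(rows, {"email": "hash", "ssn": "null"})
```

Unlike pruning, the masked rows keep every column, so the schema seen by downstream readers stays the same.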
-
I have code that writes a structure using a GenericWriter. Using the default buffer size occasionally resulted in parquet files that would cause a panic when read in (I used other tools to attempt rea…
-
### Describe the enhancement requested
Currently we use debug tools for analysis. I found that `printJson` might not print a valid JSON message.
This is OK for human reading, but making analy…
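
One way to check whether a tool's output is machine-readable is to run it through a strict JSON parser. The sketch below uses Python's standard `json` module. The snippet does not say why the `printJson` output is invalid, so the unescaped-quote case here is only an assumed example of how hand-assembled output can break, and `is_valid_json` is a hypothetical helper.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Assumed failure mode: a value containing a double quote.
name = 'col "a"'

# Hand-assembled output does not escape the inner quotes.
hand_built = '{"name": "' + name + '"}'

# A JSON serializer escapes them, so the result stays parseable.
serialized = json.dumps({"name": name})
```

Here `is_valid_json(hand_built)` is false while `is_valid_json(serialized)` is true, which is the difference that matters once the output is consumed by scripts rather than read by a person.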
-
### Apache Iceberg version
main (development)
### Query engine
Spark
### Please describe the bug 🐞
The bug is present in Iceberg 1.2 and later (and is in main).
A customer uses Impala …
-
### Bug description
The result differs from the parquet-tools output:
```
lniu@devrestricted-lniu:~/velox_parquet_test_triage/fail_parquet_files/testStructOfTwoArrays$ parquet-tools show native_parque…
```
-
I'm trying to update parquet-tools with the changes after the `Delayed dictionary` (#160) PR.
I'm using the `read::decompress` command to extract the page, and then I'm using this function to deco…