-
### Search before asking
- [X] I searched in the [issues](https://github.com/apache/paimon/issues) and found nothing similar.
### Motivation
### Basic Information
table
```
{
"ver…
-
**Affected module**
Ingestion Framework
**Describe the bug**
Failed to run S3 storage metadata ingestion due _SUCCESS file in dataPath entries folder.
**To Reproduce**
openmetadata.json
```…
-
## tl;dr
The file generated by the following can be read back in with Zed tools but not some other tools.
```
$ echo '{device_floor: 3 (uint8)}' | zq -f parquet -o repro-uint8.parquet -
```
…
-
The following statement works in the CLI (duckdb v 1.1.3):
```sql
create secret (
type S3,
endpoint 'my.objectstorage.minio.org',
use_ssl 'true',
url_style 'path',
key_id 'some_s3_user',…
-
While investigating some NDS queries, noticed that sometimes Parquet scans are producing more batches than required for the target batch size. For example, this screenshot of a query plan snippet fro…
jlowe updated
3 weeks ago
-
Since we have a separation between the logical plan and execution, we can allow querying arbitrary data sources using PromQL.
One such data source could be Apache Parquet files, similar to how http…
-
Name mapping is used when the files in the table don't have field-IDs encoded in the Parquet files. For example, when adding files through `add_files` in the case of a table migration from Hive, the P…
-
We need to set up the parquet export job on GCP properly. This includes a cron job to run it every night. The set up can be similar to our Sourcify Cloud Run job.
We can already start to run the st…
-
From issue https://github.com/apache/arrow/issues/14229.
The bug looks like this:
- create a pandas dataframe with **one column** and `n` rows, `n < max(int32)`
- each elemenet is a list with `m` …
-
I’m working on a tool that generates Parquet files based on a file definition provided in JSON. I use the parquet-java library for this, and I’m curious if it’s possible to specify a particular type o…