-
Hello 👍
I read your article about CDC and Metorikku; great article.
I have a case where 200 tables arrive in Parquet format in my data lake; could Metorikku process more than one table…
-
**Describe the problem you faced**
A clear and concise description of the problem.
"I use Flink CDC to read MySQL data and then write it to S3 through Hudi. I often encounter checkpoint org.apa…
-
Hudi 0.12.1
When upserting a Spark DataFrame with column-comment metadata, the comments are present in the committed Avro schema. If enabled, they are also propagated to HMS.
But the Spark datasource likely omits them while reading. …
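A hedged way to observe the asymmetry (the table name `hudi_tbl` and the base path below are placeholders, not taken from the report): the comment survives in the HMS-synced catalog view, while a direct datasource read resolves the schema from the Hudi table itself.

```sql
-- Placeholder names; adjust to your environment.
-- Through the HMS-synced catalog the column comment is visible:
DESCRIBE TABLE hudi_tbl;
-- A path-based read goes through the Hudi Spark datasource instead,
-- where the comment may be missing from the resolved schema:
-- spark.read.format("hudi").load("<base_path>").printSchema()
```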
-
### Feature Request / Improvement
XTable 2nd release updates.
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's […
-
**Describe the problem you faced**
Currently, we have a pipeline with approximately 2 billion records and 95 columns that runs every day. Yesterday, at the time of execution, there was an intermitt…
-
I am using the path parameter with run_clustering, but I'm encountering an error.
**Expected behaviour**
Clustering should execute successfully.
**Environment Description**
Hudi version : 0.…
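For context, a minimal sketch of a path-based clustering call via Hudi's Spark SQL procedures (the S3 base path below is a placeholder, not the reporter's actual path):

```sql
-- Placeholder base path; requires the Hudi Spark bundle with SQL procedures enabled.
CALL run_clustering(path => 's3://bucket/db/tbl');
```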
-
I recently made a small demo of CDC:
Flink CDC captures MySQL data changes and sinks them to Hudi, synchronized to Hive.
But when I update or delete data,
it fails on the delete or upda…
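A minimal sketch of such a Flink SQL Hudi sink, assuming MOR and HMS-mode Hive sync (the table name, columns, and path are placeholders, not taken from the demo):

```sql
-- Placeholder schema and path.
CREATE TABLE hudi_sink (
  id   BIGINT,
  name STRING,
  ts   TIMESTAMP(3),
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector'  = 'hudi',
  'path'       = 'hdfs:///tmp/hudi_sink',
  'table.type' = 'MERGE_ON_READ',  -- upserts/deletes land in log files
  'hive_sync.enable' = 'true',     -- sync table metadata to Hive
  'hive_sync.mode'   = 'hms'
);
```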
-
Env: AWS EMR on EKS 7.1
Hudi version: 0.14.1
Athena engine: v3
Table mode: MOR
**This happens ONLY if I use custom Payload class.**
When the RT and RO tables aren’t synced, selecting the RT table in…
-
**Describe the problem you faced**
Spark reads invalid timestamp(3) data when the record in the log file is older than the same record in the parquet file.
**To Reproduce**
1. Create a MOR table with a timestamp(3) column.
e.g.
…
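A hedged sketch of step 1 in Flink SQL (TIMESTAMP(3) is a Flink SQL type; the table name, columns, and path are placeholders, not taken from the report):

```sql
-- Placeholder DDL for a MOR table with a TIMESTAMP(3) column.
CREATE TABLE ts_mor (
  id BIGINT,
  ts TIMESTAMP(3),
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector'  = 'hudi',
  'path'       = 'file:///tmp/ts_mor',
  'table.type' = 'MERGE_ON_READ'
);
```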
-
I am running a Flink job writing data into a Hudi table; then I get an error:
```2024-11-14 17:15:50,553 Source: ygs_ods_inner_ord_offer_inst_pay_info -> hoodie_append_write: ygs_ods_inner_ord_off…