-
**Describe the problem you faced**
1. Spark Structured Streaming: upsert into a MOR table (record_index).
2. After compaction, there are a large number of log files with size 0, and they are never cleaned up.
**Plea…
-
Using https://github.com/StarRocks/demo/tree/master/documentation-samples/datalakehouse and https://github.com/StarRocks/demo/pull/56
Create huditest bucket.
```
yum install -y python3
rm -f /…
-
**Describe the problem you faced**
Changing the payload does not work on the master branch, although it works in 0.14.x: https://github.com/apache/hudi/pull/10857
**Expected be…
-
**Describe the problem you faced**
Running a brand new HoodieStreamer on an empty folder fails to create the metadata table. This is running on a fresh build of the HudiUtilitiesBundle jar off of th…
-
Greetings,
I am currently engaged in developing a community video aimed at illustrating to users the advantages of utilizing DeltaStreamer on AWS Glue as opposed to EMR. AWS Glue, being serverless …
-
**Describe the problem you faced**
After an insert overwrite with a replacement instant, archiving cannot be executed.
![1710641988757.png](https://github.com/apache/hudi/assets/10645422/3bc8867a-e2c6-44f2-96c1-81e3…
-
We were using EMR version **emr-6.11.0**, Hudi version **0.13.0-amzn-0**, Spark version **3.3.2**, and Hive version **3.1.3**.
We recently migrated from hudi 0.12.2 to 0.13.0. The main reason for…
-
Currently, FlinkSink requires developers to pass the schema parameter to build the DataStream, which means that once the schema is given, the TableSchema is determined and can never be changed, but in pr…
-
Using the NYC taxi dataset. I have no idea why it's asking about a ts column since the parquet doesn't have that field.
```
atwong@Albert-CelerData Downloads % parquet-tools inspect green_tripd…
-
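## Project overview
Implement a unified stream-batch hybrid storage combining Kafka and Iceberg, based on Apache Flink. By combining the strengths of message queues and data lakes, it gives users a logically unified table view that can be queried and written directly with Flink SQL. It also offers millisecond-level streaming writes and reads, efficient query and analysis of archived data, low-cost and efficient storage, support for row-level updates and deletes, and unified full and incremental reads.
## Background
Currently, when users use K…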