apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[Bug] Unaware-bucket mode does not support multiple writers writing to the same Kafka partition when Kafka is used as the log system. #4515

Closed. liming30 closed this issue 1 week ago

liming30 commented 1 week ago

Search before asking

Paimon version

paimon-1.0-snapshot

Compute Engine

flink-1.17

Minimal reproduce step

  1. Create an append-only table and use Kafka as the log system.
  2. Use Flink to write to the table with the write parallelism set to a value greater than 1.

What doesn't meet your expectations?

Caused by: java.lang.RuntimeException: bucket-0 appears multiple times, which is not possible.
    at org.apache.paimon.manifest.ManifestCommittable.addLogOffset(ManifestCommittable.java:68)
    at org.apache.paimon.flink.sink.StoreCommitter.combine(StoreCommitter.java:97)
    at org.apache.paimon.flink.sink.StoreCommitter.combine(StoreCommitter.java:79)
    at org.apache.paimon.flink.sink.StoreCommitter.combine(StoreCommitter.java:42)
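
To illustrate how this error can arise, here is a minimal, self-contained sketch. It is not Paimon's code: `LogOffsetModel`, `Committable`, and `combine` are hypothetical stand-ins, and the sketch assumes that every parallel writer of an unaware-bucket table reports its log offset under bucket 0, as the error message suggests. Under that assumption, combining the reports of two writers into a single committable that allows only one offset per bucket fails on the second report.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of the failure; not the actual Paimon classes. */
class LogOffsetModel {

    /** Stand-in for a committable that tracks at most one log offset per bucket. */
    static final class Committable {
        private final Map<Integer, Long> logOffsets = new HashMap<>();

        void addLogOffset(int bucket, long offset) {
            // A second offset for the same bucket is rejected outright.
            if (logOffsets.containsKey(bucket)) {
                throw new RuntimeException(
                        String.format(
                                "bucket-%d appears multiple times, which is not possible.",
                                bucket));
            }
            logOffsets.put(bucket, offset);
        }

        Map<Integer, Long> logOffsets() {
            return logOffsets;
        }
    }

    /** Merges per-writer reports, each given as {bucket, offset}, into one committable. */
    static Committable combine(List<long[]> reports) {
        Committable committable = new Committable();
        for (long[] report : reports) {
            committable.addLogOffset((int) report[0], report[1]);
        }
        return committable;
    }

    public static void main(String[] args) {
        // Parallelism 1: a single writer reports bucket 0, so combining succeeds.
        combine(List.of(new long[] {0, 10L}));

        // Parallelism 2 (assumption: both writers report bucket 0): combining fails.
        try {
            combine(List.of(new long[] {0, 10L}, new long[] {0, 7L}));
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the first call succeeds and the second prints the same message as the report above. It only shows the collision itself; it does not suggest how the offsets of several writers should be merged.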

Anything else?

No response

Are you willing to submit a PR?