flink-extended / flink-remote-shuffle

Remote Shuffle Service for Flink
Apache License 2.0
191 stars 56 forks

[FRS-72] Fix the heartbeat issue caused by zookeeper restart #73

Closed wsry closed 2 years ago

wsry commented 2 years ago

What is the purpose of the change

This resolves #72. Currently, the shuffle manager may fail to remove a lost shuffle worker if ZooKeeper restarts, because the restart changes the RPC main thread executor. This patch fixes the issue.
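
For readers unfamiliar with this class of bug, here is a minimal, self-contained sketch of how a replaced executor can swallow a heartbeat-timeout action. It is not the code from this patch. All names (`HeartbeatExecutorSketch`, `StoppableExecutor`, `WorkerTracker`, `removesLostWorker`) are hypothetical, and it assumes the old executor silently drops tasks once it has been stopped, which may differ from what the real RPC main thread executor does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class HeartbeatExecutorSketch {

    /** Hypothetical executor: runs tasks inline until stopped, then silently drops them. */
    static final class StoppableExecutor implements Executor {
        private volatile boolean stopped;

        void stop() {
            stopped = true;
        }

        @Override
        public void execute(Runnable task) {
            if (!stopped) {
                task.run();
            }
        }
    }

    /** Hypothetical tracker: removes a worker on heartbeat timeout, on the supplied executor. */
    static final class WorkerTracker {
        private final Supplier<Executor> executorSupplier;
        final List<String> workers = new ArrayList<>();

        WorkerTracker(Supplier<Executor> executorSupplier) {
            this.executorSupplier = executorSupplier;
        }

        void onHeartbeatTimeout(String workerId) {
            // The executor is resolved at timeout time, via the supplier.
            executorSupplier.get().execute(() -> workers.remove(workerId));
        }
    }

    /** Returns true if the lost worker is removed after a simulated executor swap. */
    static boolean removesLostWorker(boolean cacheExecutor) {
        StoppableExecutor first = new StoppableExecutor();
        AtomicReference<Executor> current = new AtomicReference<>(first);

        // Either pin the executor seen at startup, or look it up on every use.
        Executor cached = current.get();
        Supplier<Executor> supplier = cacheExecutor ? () -> cached : current::get;

        WorkerTracker tracker = new WorkerTracker(supplier);
        tracker.workers.add("worker-1");

        // Simulated restart: the old executor stops and a new one takes its place.
        first.stop();
        current.set(new StoppableExecutor());

        tracker.onHeartbeatTimeout("worker-1");
        return tracker.workers.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println("cached executor removes worker: " + removesLostWorker(true));
        System.out.println("fresh lookup removes worker: " + removesLostWorker(false));
    }
}
```

In this sketch the cached variant leaves `worker-1` registered after the swap, while the variant that resolves the executor on each use removes it. Whether the actual fix takes this approach is not stated in this description.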

Brief change log

Verifying this change

This change added tests.