flink-extended / flink-remote-shuffle

Remote Shuffle Service for Flink
Apache License 2.0

Introduce a simple disk health checker #28

Closed wsry closed 2 years ago

wsry commented 2 years ago

Motivation

If a disk is unhealthy, it needs to be removed automatically; otherwise, jobs that use it can fail repeatedly and never succeed.

Changes

Implement a simple disk health checker that checks whether a new file can be created on the disk and whether that file is readable and writable. If any of these checks fails, the unhealthy disk is removed automatically.
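For illustration, the check described above could look roughly like the following. This is a minimal sketch, not the code merged for this issue: the class name `SimpleDiskHealthChecker` and the methods `isHealthy` and `filterHealthy` are hypothetical, and it assumes each disk is represented by a storage directory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; names are not taken from flink-remote-shuffle. */
public class SimpleDiskHealthChecker {

    private static final byte[] PROBE =
            "disk-health-probe".getBytes(StandardCharsets.UTF_8);

    /** Returns true if a probe file can be created, written and read back under dir. */
    public static boolean isHealthy(Path dir) {
        Path probe = null;
        try {
            // Create a new file on the disk being checked.
            probe = Files.createTempFile(dir, "health-check-", ".tmp");
            // Write a few bytes, then read them back and compare.
            Files.write(probe, PROBE);
            return Arrays.equals(PROBE, Files.readAllBytes(probe));
        } catch (IOException | RuntimeException e) {
            // Any failure to create, write or read marks the disk as unhealthy.
            return false;
        } finally {
            if (probe != null) {
                try {
                    Files.deleteIfExists(probe);
                } catch (IOException ignored) {
                    // Best-effort cleanup of the probe file.
                }
            }
        }
    }

    /** Keeps only the directories that pass the check, dropping unhealthy ones. */
    public static List<Path> filterHealthy(List<Path> dirs) {
        List<Path> healthy = new ArrayList<>();
        for (Path dir : dirs) {
            if (isHealthy(dir)) {
                healthy.add(dir);
            }
        }
        return healthy;
    }
}
```

A periodic task could call `filterHealthy` on the configured storage directories and stop assigning new data to any directory that drops out of the result. How often to run the check and whether a removed disk can be added back later are left open here.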

Test

wsry commented 2 years ago

Resolved.