This PR addresses KAFKA-17515.
I found two issues in the flaky tests (the log analysis is in the Jira comments):
The error "java.nio.file.DirectoryNotEmptyException" occurs if the flush() in kafkaStreams.close() and purgeLocalStreamsState() run at the same time. The current close timeout is 5 seconds, which is too short because the CI is unstable and slow.
Race condition: tasks to be restored in ks-1 are rebalanced to ks-2 before they enter the active restoring state, so onRestoreSuspended() is never triggered.
To solve the issues:
Remove the timeout in kafkaStreams.close()
Ensure all tasks in ks-1 are actively restoring before starting the second KafkaStreams instance (ks-2)
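
The second fix amounts to a gate that opens only once every expected task has begun restoring. Below is a minimal, JDK-only sketch of that idea; it is not the code in this PR. `RestoreStartGate`, `markRestoreStarted`, `awaitAllRestoring`, and the partition strings are hypothetical names introduced for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical test helper (not part of Kafka): records which partitions
// have started restoring and lets the test block until all of them have.
final class RestoreStartGate {
    private final Set<String> started = ConcurrentHashMap.newKeySet();
    private final CountDownLatch latch;

    RestoreStartGate(int expectedPartitions) {
        this.latch = new CountDownLatch(expectedPartitions);
    }

    // Intended to be called from a restore-start callback.
    // A partition reported more than once is counted only once.
    void markRestoreStarted(String partition) {
        if (started.add(partition)) {
            latch.countDown();
        }
    }

    // Returns true once every expected partition has started restoring,
    // or false if the timeout elapses or the waiting thread is interrupted.
    boolean awaitAllRestoring(long timeoutMillis) {
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int startedCount() {
        return started.size();
    }
}
```

In a test, one plausible wiring (an assumption, not confirmed by this description) is to call `markRestoreStarted` from the `onRestoreStart` callback of the `StateRestoreListener` registered on ks-1, and to start ks-2 only after `awaitAllRestoring` returns true.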
Committer Checklist (excluded from commit message)