Broker becomes leader after restart before STM replay has finished

Version & Environment

Redpanda version: dev

What went wrong?

When Redpanda is restarted with many unflushed partitions (for example, because producers run with acks=1 or because no snapshot exists yet), the broker appears to become leader before it has finished STM replay for all partitions.

This is problematic because the replay puts strain on the system while the broker is already serving clients again. The combined load can overload the broker (CPU or disk) and cause instability such as leadership ping-pong.

What should have happened instead?

The broker should only become leader once it is in a fully recovered state again, that is, once STM replay has finished.
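
To make the expected behaviour concrete, here is a minimal toy model in Python. It is not Redpanda code: `PartitionReplica`, `replay_step`, `may_become_leader`, and `broker_fully_recovered` are hypothetical names invented for this sketch, and the real fix may gate leadership per partition, per broker, or in some other way. The idea is only that a replica refuses leadership until its STM has applied everything in the local log.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartitionReplica:
    """Hypothetical model of one partition replica on a restarted broker."""
    name: str
    last_log_offset: int        # highest offset present in the local log
    replayed_offset: int = -1   # highest offset the STM has applied so far

    def replay_step(self, batch: int) -> None:
        # Apply up to `batch` more offsets, never going past the end of the log.
        self.replayed_offset = min(self.last_log_offset,
                                   self.replayed_offset + batch)

    def replay_done(self) -> bool:
        return self.replayed_offset >= self.last_log_offset

    def may_become_leader(self) -> bool:
        # The gate this issue asks for: no leadership before replay completes.
        return self.replay_done()


def eligible_for_leadership(replicas: List[PartitionReplica]) -> List[str]:
    """Names of replicas that have finished replay (per-partition gating)."""
    return [r.name for r in replicas if r.may_become_leader()]


def broker_fully_recovered(replicas: List[PartitionReplica]) -> bool:
    """Stricter variant: the broker counts as recovered only when all replicas are."""
    return all(r.replay_done() for r in replicas)


replicas = [PartitionReplica("topic/0", last_log_offset=99),
            PartitionReplica("topic/1", last_log_offset=9)]
for r in replicas:
    r.replay_step(10)   # one replay round of 10 offsets each

eligible = eligible_for_leadership(replicas)
recovered = broker_fully_recovered(replicas)
```

After one round, only the replica with the short log has caught up, so in this model it alone would be allowed to stand for election, and the broker as a whole would not yet count as recovered.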

How to reproduce the issue?

1. Use a weak disk (for example, EBS with limited IOPS and throughput).
2. Create a big unflushed log.
3. Restart Redpanda.

Additional information

Below are some metrics from the scenario occurring. The gap is the node being shut down. After the restart, the node regains leadership very quickly while disk reads are still ongoing for about 20 minutes. During that period the cluster is unstable and leadership changes constantly. Once the reads have finished, the cluster becomes stable again.

JIRA Link: CORE-1455
