We gain flexibility if, instead of thinking in terms of a set of frontbuffers and backbuffers, we think of a set of NUM_BUF continuous buffers.
An entire line needs to be stored at once, so each line buffer needs at least IMAGE_WIDTH * 8 bytes of storage.
In the worst-case scenario, MCU_SIZE^2 * ceil(MCU_WIDTH / NUM_BUF) bytes are needed in each buffer. By sheer luck, on the iCE40, 1 line fits perfectly into 5 MCUs.
Two line buffers are needed. They do not need to be dual-port.
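
The sizing arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: MCUs are taken to be 8x8 samples at one byte per sample, the two expressions are read as IMAGE_WIDTH * 8 and MCU_SIZE^2 * ceil(MCU_WIDTH / NUM_BUF), MCU_WIDTH is read as the number of MCUs in one line, and the 320-pixel width is a hypothetical value chosen for the example. The function names are not from the design.

```python
import math

MCU_SIZE = 8  # assumption: 8x8-sample MCUs, one byte per sample


def line_bytes(image_width: int) -> int:
    """Bytes needed to hold one full line of MCUs (MCU_SIZE pixel rows)."""
    return image_width * MCU_SIZE


def bytes_per_buffer(image_width: int, num_buf: int) -> int:
    """Worst-case bytes per buffer when one line is spread over num_buf buffers."""
    # MCU_WIDTH is assumed here to mean "MCUs per line".
    mcus_per_line = math.ceil(image_width / MCU_SIZE)
    return MCU_SIZE ** 2 * math.ceil(mcus_per_line / num_buf)


if __name__ == "__main__":
    width = 320  # hypothetical image width, not taken from the notes
    print("bytes per line:", line_bytes(width))
    for num_buf in (1, 2, 3, 5, 8):
        per_buf = bytes_per_buffer(width, num_buf)
        # Rounding up per buffer can waste space when the split is uneven.
        slack = per_buf * num_buf - line_bytes(width)
        print(f"NUM_BUF={num_buf}: {per_buf} bytes/buffer, {slack} bytes slack")
```

When the number of MCUs per line is a multiple of NUM_BUF, the slack is zero and the per-buffer sizes add up to exactly one line; otherwise the ceil() rounds every buffer up to a whole MCU.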
Some basic notes: