Only write to block devices from aligned buffers.
This means every read stream reads into aligned buffers when the target is a block device.
When we don't control the read stream, a transform copies its buffers into aligned buffers.
Each read stream uses a pool of 1 MiB buffers to minimize allocations.
Major change
This should improve write speed and reduce CPU usage.
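
The approach described above (a pool of 1 MiB aligned buffers, plus a copying step for read streams we don't control) can be sketched roughly as follows. This is an illustration, not the code from this change: the names AlignedBufferPool, align_chunks and address_of are hypothetical, the alignment is assumed to be the system page size, and zero-padding the final partial buffer is an assumption of the sketch. Anonymous mmap regions are used only because they start on a page boundary.

```python
import ctypes
import mmap

ALIGNMENT = mmap.PAGESIZE   # assumption: page-size alignment is sufficient
BUFFER_SIZE = 1024 * 1024   # 1 MiB, as stated in the description


class AlignedBufferPool:
    """Hypothetical pool of reusable page-aligned buffers."""

    def __init__(self, buffer_size=BUFFER_SIZE):
        self.buffer_size = buffer_size
        self._free = []
        self.allocated = 0  # number of buffers actually allocated

    def acquire(self):
        # Reuse a released buffer when one is available.
        if self._free:
            return self._free.pop()
        self.allocated += 1
        # Anonymous mmap memory starts on a page boundary.
        return mmap.mmap(-1, self.buffer_size)

    def release(self, buf):
        self._free.append(buf)


def address_of(buf):
    """Return the memory address of a writable buffer (for checking alignment)."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


def align_chunks(chunks, pool):
    """Copy arbitrarily sized byte chunks into pooled aligned buffers.

    Yields (buffer, used) pairs. The caller releases each buffer back to
    the pool once it has been written.
    """
    buf = pool.acquire()
    filled = 0
    for chunk in chunks:
        view = memoryview(chunk)
        while len(view):
            n = min(len(view), pool.buffer_size - filled)
            buf[filled:filled + n] = view[:n]
            filled += n
            view = view[n:]
            if filled == pool.buffer_size:
                yield buf, filled
                buf = pool.acquire()
                filled = 0
    if filled:
        # Assumption: pad the tail with zeros up to the next aligned length.
        padded = -(-filled // ALIGNMENT) * ALIGNMENT
        buf[filled:padded] = bytes(padded - filled)
        yield buf, padded
    else:
        pool.release(buf)
```

A consumer would iterate over align_chunks(source, pool), write the first `used` bytes of each buffer to the device, and then call pool.release(buf). When each buffer is released before the next one is requested, a single allocation serves the whole copy, which is the point of pooling.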