flink-extended / flink-remote-shuffle

Remote Shuffle Service for Flink
Apache License 2.0

Fix the bug that BaseMapPartition may not read data sequentially #79

Closed TanYuxin-tyx closed 2 years ago

TanYuxin-tyx commented 2 years ago

When reading data from MapPartition files, it is common for some subpartitions to be requested before others, so their region indexes are ahead of the others. If all region data of a subpartition can be read in one round, some subpartition readers will always be ahead of others, which makes the file reads jump back and forth and causes random IO. This patch fixes this case by polling one subpartition reader at a time, so that data is read sequentially.
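The effect can be illustrated with a toy model. This is only a sketch and not the code in `BaseMapPartition`: the file layout (regions stored back to back, one unit per subpartition inside each region), the priority queue keyed by file offset, and all function names are assumptions made for illustration.

```python
import heapq


def region_offset(region, subpartition, num_subpartitions):
    # Assumed layout: regions are stored back to back, and inside a region
    # each subpartition occupies one unit, in subpartition-index order.
    return region * num_subpartitions + subpartition


def read_all_per_reader(start_regions, num_regions):
    """Each reader drains all of its remaining regions in one round."""
    n = len(start_regions)
    return [
        region_offset(region, sub, n)
        for sub, start in enumerate(start_regions)
        for region in range(start, num_regions)
    ]


def read_one_reader_at_a_time(start_regions, num_regions):
    """Poll only the reader with the smallest offset, read one region, re-queue it."""
    n = len(start_regions)
    heap = [
        (region_offset(start, sub, n), start, sub)
        for sub, start in enumerate(start_regions)
        if start < num_regions
    ]
    heapq.heapify(heap)
    order = []
    while heap:
        offset, region, sub = heapq.heappop(heap)
        order.append(offset)
        if region + 1 < num_regions:
            # The reader goes back into the queue at its next region's offset.
            heapq.heappush(heap, (region_offset(region + 1, sub, n), region + 1, sub))
    return order


def backward_seeks(order):
    """Count reads that start before the previous read's offset."""
    return sum(1 for prev, cur in zip(order, order[1:]) if cur < prev)


if __name__ == "__main__":
    # Reader 0 was requested earlier and is already at region 1.
    starts = [1, 0, 0]
    print(read_all_per_reader(starts, 3))        # [3, 6, 1, 4, 7, 2, 5, 8]
    print(read_one_reader_at_a_time(starts, 3))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

In this model, draining each reader in turn produces two backward seeks, while polling a single reader per step yields offsets in strictly increasing order.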