Tencent / Firestorm

Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark and Apache Hadoop MapReduce applications to store shuffle data on remote servers
Other
252 stars 72 forks source link

[Minor] Prevent returning incomplete shuffle blocks #183

Closed zuston closed 2 years ago

zuston commented 2 years ago

What changes were proposed in this pull request?

The problem only occurs when wb.getMemoryUsed() > bufferSize and usedBytes.get() - inSendListBytes.get() > spillSize.

Although it doesn't seem to happen because of the logic, i think it should be optimized by using addAll to avoid confusion.

Why are the changes needed?

Better code

Does this PR introduce any user-facing change?

No

How was this patch tested?

No need.

zuston commented 2 years ago

@jerqi Minor fix.

jerqi commented 2 years ago

Could you add the ut for this fix?

jerqi commented 2 years ago

And there is a flaky test that cause CI pipeline fake tests. We have submitted a pr to disable the test temporarily #184 . You can rebase master to avoid the failures which are caused by that flaky test.

zuston commented 2 years ago

I think this bug will not happen, so current test cases are enough.

The condition of wb.getMemoryUsed() > bufferSize and usedBytes.get() - inSendListBytes.get() > spillSize are mutually exclusive.

@jerqi If i'm wrong, please point it. Thanks.

jerqi commented 2 years ago

I think this bug will not happen, so current test cases are enough.

The condition of wb.getMemoryUsed() > bufferSize and usedBytes.get() - inSendListBytes.get() > spillSize are mutually exclusive.

@jerqi If i'm wrong, please point it. Thanks.

Yes, you're right. It will not happen. Please rebase master. If CI passed, I will merge it.