ceph / dmclock

Code that implements the dmclock distributed quality of service algorithm. See "mClock: Handling Throughput Variability for Hypervisor IO Scheduling" by Gulati, Merchant, and Varman.
Other
89 stars 55 forks source link

server: fix run_sched_ahead() scheduling issue #84

Closed yongseokoh closed 3 years ago

yongseokoh commented 4 years ago

The first issue is that the run_sched_ahead() function is blocked when the push priority queue is employed. It is confirmed that the sched_ahead_cv.wait() isn't signaled if the predicate function returns false with this->finishing variable. For this reason, sched_ahead_when > TimeZero condition is added pred() function.

The other is that the further request is not fetched when the schedule_request() executes the submit_request() and sched_ahead_when is set to TimeZero. It implies that run_sched_ahead thread can't wake up by itself. This commit adds that the schedule_request() is called while requests are ready and sched_ahead_when is equal to TimeZero. A unit test has also been added to check whether 3 requests are dispatched after a few seconds.

Fixes: https://tracker.ceph.com/issues/48123 Fixes: https://tracker.ceph.com/issues/48317 Signed-off-by: Yongseok Oh yongseok.oh@linecorp.com

yongseokoh commented 3 years ago

@cbodley @ivancich Could you please take a look at this? We have been actively working on the MDS QoS RFC based on the push priority queue. However, it sometimes is not working properly with this issue. We would like to confirm whether this PR needs to be considered.

ivancich commented 3 years ago

@yongseokoh I will try to review this in the next day or so. It's been a while since I've been through this code, so I need to invest some time in it. Thanks for your contributions!

athanatos commented 3 years ago

@ivancich Have you had a chance to take a look?

yongseokoh commented 3 years ago

@ivancich Could you please take a look at this? I have made some efforts to this project based on @athanatos' review.

tchaikov commented 3 years ago

@yongseokoh sorry for not being able to send the comments in a single batch. i am still trying to get my head around the dmClock algorithm and our implementation.

tchaikov commented 3 years ago

@ivancich hi Eric, do you want to take a final look before this PR gets merged?

tchaikov commented 3 years ago

@yongseokoh just one last thing, could you squash the fix-up commits into the original change ? for instance, to fold "server: replace request_count() with !empty() for better performance " into "server: allow schedule_request returns NextReqType that indicates us ..." ?