Closed temeddix closed 1 year ago
@dnadlinger Please tell me if you spot anything to modify more :)
I went through the comments on #101, looks like that was an oversight on our part. @dnadlinger, do you have a suggestion for a unit test that can trigger a race condition if no locking is present?
A test case that has a good chance at reproducing the issue would probably involve I/O operations that complete very quickly. It would by necessity be indeterministic, though; perhaps you could exchange many short bursts of data over a local socket pair to increase the chance of hitting it.
The test case that was ignored in #101 is probably a good start. By the way, it sure seems like any reference to asyncio's add_reader
/add_writer
in the discussion there was a red herring, as the methods in question are actually defined in _QEventLoop and build on Qt functionality.
You might also want not to copy over my commit with the work-in-progress FIXME messages over wholesale (and without attribution, I might add), but rather just cleanly revert the changes from #101 in one commit, and then in a second just add the few-line change moving the GetQueuedCompletionStatus()
call outside the loop (perhaps including a comment as to why this works). This way, it should be easier for somebody in the future to trace what the actual change here was. I can prepare a PR with just this change and a comment with a bit more of an explanation if you'd prefer.
On a related note, I still haven't look into the other issues indicated in the FIXME comments, and am unlikely to have the bandwidth for this soon, so if you wanted to spend some time investigating this, it would certainly be helpful.
I reverted the old commit, created a new one, and force-pushed with Git. Now the Git history will be easier to read.
I too don't have much time to inspect those 'FIXME' comments right now, so restoring those mutexes was the best I could do.
You've left all the fixme comments, word for word what was in the commit that @dnadlinger linked. Can you please submit 2 commits: one reverting #101 like you already did, and another one moving the mutex inside of the while loop and after the GetQueuedCompletionStatus() call.
Sorry if I misunderstood, but are those 2 commits without FIXME comments you're talking about?
Done :)
As @dnadlinger mentioned in this post, I decided to check if the performance can be fine while still using the mutex and semaphore, with the code he provided.
This change doesn't make the code identical to that of before #101. It slightly modifies the
_poll
method.The main points:
Please focus on the Median duration.
Before #101:
Before this change(After #101):
After this change:
The numbers(though there are small numerical errors) are telling us that restoring mutex this way doesn't negatively affect performance, but ensures safety as
qasync
did before #101. Please tell me if this change would be okay, @hosaka. I'm open to changes and will reply as fast as I can.