Closed aiudirog closed 4 years ago
I spent longer than I'm willing to admit trying to fix that bug thinking the readall()
caused it, when in fact it only forced it to show itself:D.
Merging #99 into master will decrease coverage by
0.01%
. The diff coverage is93.33%
.
@@ Coverage Diff @@
## master #99 +/- ##
==========================================
- Coverage 80.87% 80.86% -0.02%
==========================================
Files 35 35
Lines 3529 3532 +3
==========================================
+ Hits 2854 2856 +2
- Misses 675 676 +1
Impacted Files | Coverage Δ | |
---|---|---|
thriftpy2/contrib/aio/transport/buffered.py | 92.1% <100%> (-1.52%) |
:arrow_down: |
thriftpy2/contrib/aio/socket.py | 78.35% <91.66%> (+0.87%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 9b23e29...5ccbd0a. Read the comment docs.
Could you please provide more information about the reason for randomly failing? And which exception you got in the failing?
The error that actually bubbled up is struct.error
emitted on one of the unpack lines like this. The specific line that it errored on changed depending on the test case (the data being sent back I assume), but the common thread was that not enough data was getting read from the socket so, for example, only 3 bytes were passed into unpack_int32()
.
Upon debugging, I found I could read more data from TAsyncSocket.read()
if I called it again, so it wasn't that HBase was dropping data. Then I noticed that unlike all of the other transports I had played with, TAsyncSocket.read()
didn't always return the length I had asked for, it just returns whatever the lower level socket returned. By adding the readall()
wrapper, like the other transports, this issue was alleviated.
However, upon doing that, another error cropped up. All of a sudden, the socket was timing out in the readall()
loop trying to read all of the data. That was when I discovered the bug in TAsyncBufferedTransport.read()
. Every call to TAsyncSocket.read()
on that line was always getting 4096 (buf_size
), instead of sz
because max was used instead of min. Since I had fixed the previous readall()
bug, now it was actively trying to read 4096 bytes every time, when it should have been reading like 10.
I have a theory that the second bug is actually the reason the first one wasn't discovered sooner. Since TAsyncBufferedTransport
was always asking the socket for 4096 bytes, all data was always consumed in the test cases because they are smaller than that size. My tests of bulk scanning HBase yielded much closer to or above the buffer size limit. This meant not all of it was being returned on the first call to TAsyncSocket.read()
, therefore causing it to not consume that data.
Edit: I guess I forgot to address the randomness. I assume it was due to variations in delays between HBase sending data and my code reading it. Sometimes the test cases would pass, especially in the debugger (which made it very fun to track down). So, if you pause right before the read, the socket has time to buffer all of the data and the single read call might be able to consume it all.
Great stuff! I think you are right, it looks good to me, thank you.
Wow that was fast, thanks!
Hi @aiudirog, I think there is a simpler way to fix it, use readexactly
rather than read
, see https://docs.python.org/3/library/asyncio-stream.html#asyncio.StreamReader.readexactly. So sorry I will try to revert this pull request and test it.
I had no idea that method existed. I'm testing it out now with my happybase code.
A couple of test cases fail where it is asked to read 0 bytes (raises an EOF error). I'll make the changes and update this PR.
Thank you. I will await your response.
Okay, I've got it, this passed all thriftpy2 and aiohappybase test cases: https://github.com/aiudirog/thriftpy2/commit/16bbf52f3988c2766bb0162df25fad4298ea8d4c
Fun how simple it is with that method, haha.
Maybe force reset and I'll open a new PR?
I have reverted it.
And please process IncompleteReadError
when call readexactly
, and re-raise TTransportException.END_OF_FILE
.
While accessing HBase via asyncio, I found that my test cases would randomly fail. It turned out that not all of the data was being consumed by
TAsyncSocket.read()
, so I refactored it to use thereadall()
thatTAsyncBufferedTransport
uses.I moved
readall()
into socket.py because it made sense to put it with the lower level implementation.