Open Rinzwind opened 5 years ago
Hi,
As far as I understand it, the existing stream protocol is not perfect since it mixes both returning nil and signalling exceptions to indicate EOF situations (on top of that there is/was actually no EOF exception).
I would define #atEnd to mean that we know for sure that the stream can not ever return any more data, but at the same time it should not block.
In general #next is defined to return an element, and block waiting for one. If the stream is #atEnd, it should return nil. It can throw exceptions.
A network stream is sometimes in a state where it does not know whether it is really #atEnd; the only way to find out is to try reading, but that would block.
So the situation you describe certainly does exist, but is handled correctly, as far as I understand.
I would say that #atEnd answering true is a certainty, but #atEnd answering false is not (always).
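That contract could be summarised in a small sketch (illustrative only; `aStream` and `processElement:` are placeholders, not actual Zn API):

```smalltalk
"Reading loop under the semantics described above:
 - #atEnd answering true is definitive: no more data, ever; it never blocks.
 - #atEnd answering false may merely mean 'unknown' for a socket stream.
 - #next blocks until an element is available, and answers nil at true EOF."
[ aStream atEnd ] whileFalse: [
	| element |
	element := aStream next.	"may block; nil here also signals end of stream"
	element
		ifNil: [ ^ self ]	"an EOF that #atEnd could not predict"
		ifNotNil: [ self processElement: element ] ]
```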
Maybe #readInto:startingAt:count:fromStream: should handle ZnIncomplete.
But going one level up, if the response has no content length, that means its entity should be read until EOF, it also means that the server should write that response out and close the connection, which would normally be seen by the client. Are you sure the server did that ?
Sven
But going one level up, if the response has no content length, that means its entity should be read until EOF, it also means that the server should write that response out and close the connection, which would normally be seen by the client. Are you sure the server did that ?
I assume the server closed the connection, as otherwise the send of #next would not have returned nil, but would have signaled a ConnectionTimedOut, no?
I would define #atEnd to mean that we know for sure that the stream can not ever return any more data, but at the same time it should not block.
In general #next is defined to return an element, and block waiting for one. If the stream is #atEnd, it should return nil. It can throw exceptions.
So that means the bug is actually in ZnUTF8Encoder>>#nextCodePointFromStream:, which should not treat a return value of nil from its first send of #next as an error, but as an “end of stream”?
As a workaround, I have patched the #atEnd condition in ZnUTF8Encoder>>#readInto:startingAt:count:fromStream: from stream atEnd ifTrue: ... to stream peek ifNil: .... At first glance, that seems to fix the problem we are having with some of our tests.
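For reference, the patched loop head would look roughly like this (a sketch: only the guard changes, and the rest of the decoding body is elided):

```smalltalk
readInto: string startingAt: offset count: requestedCount fromStream: stream
	"Patched variant: #peek forces a read on a socket stream, so a nil peek
	reliably means end of stream, unlike the non-blocking #atEnd."
	offset to: offset + requestedCount - 1 do: [ :index | | codePoint |
		stream peek ifNil: [ ^ index - offset ].
		codePoint := self nextCodePointFromStream: stream.
		"... decode codePoint into string as in the original method ..." ].
	^ requestedCount
```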
OK, an alternative could be
[ codePoint := self nextCodePointFromStream: stream ] on: ZnIncomplete do: [ ^ index - offset ].
which, if it works, would be more intention revealing.
OK, an alternative could be […]
I don't think that would be correct, as it would “swallow” every ZnIncomplete. So the following would return 'e' rather than signaling a ZnIncomplete (as it should, due to the last byte of the encoding of é being cut off):
example := ByteArray streamContents: [ :s | (ZnResponse ok: (ZnStringEntity text: 'eé')) writeOn: s ].
(ZnResponse readFrom: (example copyFrom: 1 to: example size - 1) readStream) contents.
I'm wondering: are there other Stream-like classes where false from #atEnd means the stream may not be at the end, rather than that it is not at the end? I was not aware of having to take that into account when using streams, and ZnUTF8Encoder also does not seem to take that semantics into account. Shouldn't ZdcSocketStream offer a separate #atEndNonBlocking, with #atEnd being blocking?
Yes, there is indeed a difference between EOF at a character boundary and an incomplete UTF8 encoding. I have to think a bit more about this, and I have to try to remember what the reasoning was when I introduced ZnIncomplete.
@Rinzwind @svenvc Hi, we faced the same problem recently and we tried the workaround below. This is not fundamental but works well for us.
!ZdcIOBuffer methodsFor: '*zodiac-accessing-override'!
availableForReading
"How many elements can be read ?"
^ (writePointer - readPointer) max: 0! !
!ZdcIOBuffer methodsFor: '*zodiac-accessing-override'!
availableForWriting
"How many elements can be written at the end ?"
^ (buffer size - writePointer) max: 0! !
I hope this workaround helps you.
Hi,
Very interesting, but could you please elaborate a bit more ? How exactly does this solve the problem ?
In principle, the invariant readPointer <= writePointer should hold for ZdcIOBuffer. If you need to add the max: 0, that would suggest the invariant is broken. And then we have to understand how this is possible.
Thx,
Sven
Hello Mr. Sven,
We noticed stream atEnd returns false in ZnUTF8Encoder>>readInto:startingAt:count:fromStream:.
readInto: string startingAt: offset count: requestedCount fromStream: stream
"Read requestedCount characters into string starting at offset,
returning the number read, there could be less available when stream is atEnd.
I signal a ZnByteStringBecameWideString notification if necessary"
| stringBuffer |
stringBuffer := string.
offset to: offset + requestedCount - 1 do: [ :index | | codePoint |
stream atEnd ifTrue: [ ^ index - offset ].
codePoint := self nextCodePointFromStream: stream.
...
And then stream next in ZnUTF8Encoder>>nextCodePointFromStream: raised ZnIncomplete.
nextCodePointFromStream: stream
"Read and return the next integer code point from stream"
| code byte next |
(byte := stream next ifNil: [ ^ self errorIncomplete ]) < 128
...
Our system calls a Web API that returns XML response without 'Content-Length'
response header, and the response contains some Japanese characters.
In ZnStringEntity>>readFrom:, the buffer size is set to ZnUtils streamingBufferSize because the content length is nil. And somehow readPointer ends up past the actual stream data size (we think it is caused by a problem around multibyte index calculation, so our workaround is not a fundamental solution).
At that point, ZdcIOBuffer>>availableForReading returns a non-zero value, and ZdcIOBuffer>>isEmpty and ZdcSocketStream>>atEnd return false, but the next moment stream next returns nil. This does not occur with plain ASCII text, or when the response headers contain 'Content-Length'.
In our case, our workaround fixed the ZdcSocketStream>>atEnd behavior and solved the problem.
I appreciate your detailed description.
This is a difficult bug nonetheless. I've put a conditional breakpoint on both methods, with the condition that the returned expression is negative. Then I ran all Zn+Zdc tests, and the condition was never met.
I am on macOS, 64-bit pharo 7.0.3
From your description the following conditions are needed:
I am guessing the following are also needed:
Needless to say, we should find an easy-to-reproduce case that triggers the problem reliably.
Hi,
Actually, we found the issue when calling the Pardot (Salesforce) API with ZnClient. http://developer.pardot.com/kb/api-version-4/visitors/#assigning-and-reassigning-visitors
The API returns a large UTF-8 encoded XML document over HTTPS without a Content-Length header. Pardot is a commercial service, so it may be difficult to try the API, but I hope this helps a bit.
a large response (probably larger than some internal buffer sizes)
I can't upload the actual response because it contains private data, but the response size is 7676 bytes.
windows
Sorry, I didn't mention our development environment. We use macOS too, and our Pharo version is 64-bit 6.1. So Windows is not needed.
Above I wrote:
As a workaround, I have patched the #atEnd condition in ZnUTF8Encoder>>#readInto:startingAt:count:fromStream: from stream atEnd ifTrue: ... to stream peek ifNil: .... At first glance, that seems to fix the problem we are having with some of our tests.
We wound up using a different workaround. We added an override for #atEnd on ZdcSocketStream as an extension:
atEnd
readBuffer isEmpty ifFalse: [ ^ false ].
self isConnected ifFalse: [ ^ true ].
self fillReadBuffer.
^ readBuffer isEmpty
We thought that was a bit cleaner than patching an existing method. This override changes #atEnd to be blocking instead of non-blocking. This ties in with what I was wondering about above:
I'm wondering: are there other Stream-like classes where false from #atEnd means the stream may not be at the end, rather than that it is not at the end? I was not aware of having to take that into account when using streams, and ZnUTF8Encoder also does not seem to take that semantics into account. Shouldn't ZdcSocketStream offer a separate #atEndNonBlocking, with #atEnd being blocking?
@Rinzwind what if you tried the other workaround as well ? It would be interesting to know if we are talking about the same thing.
It is also very important to have a problematic situation that can be reliably reproduced.
@Rinzwind what if you tried the other workaround as well ?
Do you mean the patch on ZnUTF8Encoder>>#readInto:startingAt:count:fromStream: (changing stream atEnd ifTrue: ... to stream peek ifNil: ...)? That also fixes the problem we were having with our tests. It's just that when patching an existing method, the patch sometimes gets reverted to the non-patched implementation when updating packages. As ZdcSocketStream doesn't have an existing method for #atEnd (it inherits it), we thought adding an override for #atEnd there was a better solution. But either solution is sufficient to fix the problem we were having with our tests.
Which of these solutions to integrate into Zinc/Zodiac is up to you to decide. The first one makes ZnUTF8Encoder take into account that #atEnd on ZdcSocketStream is non-blocking; the other one makes #atEnd on ZdcSocketStream blocking. I'm still wondering, though, whether there are other Stream-like classes where false from #atEnd means the stream may not be at the end, rather than that it is not at the end. Is there any reason not to have ZdcSocketStream offer a separate #atEndNonBlocking, with #atEnd being blocking?
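Such a split could look roughly like this (a sketch: the method names are a suggestion, not existing Zodiac API, and the blocking body mirrors the override posted earlier in this thread):

```smalltalk
atEnd
	"Blocking: when the buffer is empty but the socket is still connected,
	try to refill before answering, so that false is a reliable answer."
	readBuffer isEmpty ifFalse: [ ^ false ].
	self isConnected ifFalse: [ ^ true ].
	self fillReadBuffer.
	^ readBuffer isEmpty

atEndNonBlocking
	"Non-blocking: true is a certainty, but false may merely mean 'unknown',
	since a disconnect might not have been observed yet."
	readBuffer isEmpty ifFalse: [ ^ false ].
	^ self isConnected not
```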
It is also very important to have a problematic situation that can be reliably reproduced.
I'm afraid I can't easily share our test setup in which the problem occurs. The key to reproducing the problem, though, would seem to be sending a request to some server that gives responses with a UTF-8 encoded entity and no Content-Length header (as otherwise, the ZdcSocketStream is wrapped in a ZnLimitedReadStream, for which #atEnd returns true as soon as the limit is reached, regardless of whether #atEnd on the ZdcSocketStream still returns false).
Thanks again for the feedback and the discussion. There is much to think about.
But first I am focusing on building a reliable reproducible test case.
Hi,
Our workaround was not effective on Linux.
But the problem was solved when we used ZnClient>>beBinary before calling ZnClient>>request: and ZnClient>>execute. This suggests the response has already been decoded once and may be decoded again somewhere.
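The binary workaround can be sketched as follows (ZnClient>>beBinary and ByteArray>>utf8Decoded exist in Pharo, but the URL is a placeholder and the explicit decoding step depends on the actual response):

```smalltalk
"Fetch the entity as raw bytes, bypassing character decoding inside Zn,
 then decode once, explicitly, on the client side."
client := ZnClient new.
client beBinary.	"entity contents stay a ByteArray"
client get: 'https://example.com/api/visitors'.	"placeholder URL"
decoded := client contents utf8Decoded.	"single, explicit UTF-8 decode"
```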
OK, noted.
If you load the latest commit (fixed a small issue related to connection closing), you can try the following example, it is a first step to set up a server that gives the response as described.
server := ZnServer startOn: 1701.
"small"
server onRequestRespond: [ :req | | resp |
resp := ZnResponse ok: (ZnEntity textCRLF: 'Testing 123').
resp setConnectionClose.
resp headers clearContentLength.
resp ].
"larger, non-ascii"
server onRequestRespond: [ :req | | resp |
resp := ZnResponse ok: (ZnEntity textCRLF: '10 €' , (String loremIpsum: 10000) , ' 10 €').
resp setConnectionClose.
resp headers clearContentLength.
resp ].
"access the response"
ZnClient new get: 'http://localhost:1701'.
However, this still works for me.
I assume this example does not (easily) reproduce the issue because the time between writing the last byte of the response and the closing of the connection is too short. So I added a delay of 100 milliseconds to ZnMultiThreadedServer>>#closeSocketStream:; as shown on the screenshot below, running the example then leads to the ZnIncomplete error:
The change to ZnMessage>>#wantsConnectionClose is the fix done in https://github.com/svenvc/zinc/commit/44b599afee3cd1501e3a66f8dca672dfc17e485d.
Excellent work. Yes, this fails for me too. It illustrates your earlier points well. Thank you.
A non-destructive way to arrive at the same server behaviour is the following: creating a ZnResponse subclass that overrides the #useConnection: hook (which is used by the WebSockets implementation).
ZnResponse subclass: #ZnResponseWith100msDelay
instanceVariableNames: ''
classVariableNames: ''
package: '_UnpackagedPackage'
ZnResponseWith100msDelay>>#useConnection: connection
100 milliSeconds wait
And then use it as follows:
server := ZnServer startOn: 1701.
server onRequestRespond: [ :req | | resp |
resp := ZnResponseWith100msDelay ok: (ZnEntity textCRLF: 'Testing 123').
resp setConnectionClose.
resp headers clearContentLength.
resp ].
Neither the response size nor it being a WideString seems relevant.
I now have enough to start working on a solution (I think both returning nil from EOF streams when decoding characters, as well as a blocking #atEnd are needed/useful). I hope this won't break anything else.
I'm wondering whether there is a bug in the implementation of atEnd on ZdcSocketStream (inherited from ZdcSimpleSocketStream): https://github.com/svenvc/zodiac/blob/e24882abdd4c7edbd5da8318131465704eb61672/repository/Zodiac-Core.package/ZdcSimpleSocketStream.class/instance/atEnd.st#L2-L13
So as far as I understand: if the buffer is empty, and no data is available, the method will return false if the socket is still connected. But that does not guarantee that the stream is not at the end, as the socket may simply be disconnected without additional data becoming available? Or am I misunderstanding something about how this works?
Case in point: we have some tests that sometimes fail while reading a ZnResponse, which seems to be due to #atEnd returning false, and a subsequent #next returning nil. As shown on the screenshots below: a ZnIncomplete is signaled while reading a ZnEntity because #next returned nil, although this is just after checking whether the stream is #atEnd, which must have returned false. (The stream should actually be at the end, as in this test the response is expected to contain Generating report 'eventfiche' (size: 30), and index in #readInto:startingAt:count:fromStream: is at 31.) Note that the response does not include a Content-Length header, so the stream has not been wrapped in a ZnLimitedReadStream.