Closed seamusmac closed 2 years ago
@jfarcand So there is no java8 build on Travis for this yet. Any chance it could be added.
@seamusmac Working on it...need to remember this this work :-) BTW Would you like to becomes a contributor? You proved many times you can be trusted and I would really like you to help!
@jfarcand Thanks. Not really sure I have time to become a contributor at the moment. 3 kids, family, startup company etc. Always happy to try and fix things, or attempt something like this the odd time :)
Regarding this PR. I have upgraded to the latest Nettosphere snapshot (updated netty version in nettosphere) to try and get all the netty version in sync, and using the latest AHC version now.
There is one class not compiling at the moment, and it is the WebSocketTransport class, it extends AHCs WebSocketUpgradeHandler, cos some of the methods have been marked final. (onBodyPartReceived, onStatusReceived, onHeadersReceived, onCompleted, onSuccess, onFailure) I think we need to replace this with the AHC WebsocketListener, but I'm not totally sure, cos the old method don't map exactly etc.
If you had time to have a look, that would be awesome.
OK I understand...this is quite bad that compatibility isn't there. Will ping @slandelle for more info :-) as it won't work with WebsocketListener
Those methods are final on master because someone forgot to port his commit on the master branch back then: https://github.com/AsyncHttpClient/async-http-client/commit/4cd825805189e38c5a0d45f7acc34e4c585ac0e5 ;-)
I have mixed feelings about making those non final. wasync is probably the only one overriding them (first time I get a complain about those) and I'm afraid of opening a Pandora box: everything would break down if they were to return something else than what's implemented in WebSocketUpgradeHandler
.
IMHO, it would be way better to add protected hooks that would return void and whose default implementations would do nothing. Contributions welcome!
@slandelle OK ... but no time for me to spend on AHC so we may need instrument the classes...no time either :-) so let's put the migration on hold unless @seamusmac you have the cycle :-)
@jfarcand
So I tried a little more at getting this to work, just to see how wasync tests go, etc. I checked out the AHC project, modifed those methods that are final to allow the override ( we can patch this part later, depending on what @slandelle agrees on)
I had 36 tests fail after this upgrade.
I think I found a fix for all the WebSocketTests, in Nettosphere latest
https://github.com/Atmosphere/nettosphere/blob/master/server/src/main/java/org/atmosphere/nettosphere/BridgeRuntime.java#L418
I change that line to binaryData.copy().readBytes(body);
The AtmosphereResource didn't have the message cos the ByteBuf was already read I think.
There are other test failures in the StreamTransportTests, for example here is a log from one: https://gist.github.com/seamusmac/8ffd9e194684b3018f91c3e32055a3d0#file-gistfile1-txt-L1564
I tried a similar fix with the the BridgeRuntime.handleHttp method but didn't help. Don't have time to look more into it this week. Will see if I can possibly look next week.
@slandelle where you envisioning something like this for the WebSocketUpgradeHandler
:
// protected template methods for processing logic extension
protected void do_onStatusReceived(HttpResponseStatus responseStatus) throws Exception {}
protected void do_onHeadersReceived(HttpHeaders headers) throws Exception {}
protected void do_onBodyPartReceived(HttpResponseBodyPart bodyPart) throws Exception {}
protected void do_onCompleted() throws Exception {}
protected void do_onThrowable(Throwable t) {}
protected void do_onOpen() {}
@Override
public final State onStatusReceived(HttpResponseStatus responseStatus) throws Exception {
this.do_onStatusReceived(responseStatus);
return responseStatus.getStatusCode() == SWITCHING_PROTOCOLS ? State.CONTINUE : State.ABORT;
}
@seamusmac I see you prepare for a fix, would that fit your override needs ? @jfarcand do you see any obstacles in going that way ?
@thabach I think that would do fine. Lets see what the others have to say :)
+1 for me as well. Great work!
@thabach Yes. But please name the hooks as onStatusReceived0
(zero) and don't use this.
when it's not necessary. PR welcome.
@slandelle ok, shall do, whatever your preference. I like to use this
as an aid in comprehension that this call is going to be a dynamically dispatched call, but no worries ð their gonna be gone in the pull request.
pull-request done: AsyncHttpClient/async-http-client#1446
I fixed up those methods. Redid the PR for nettosphere and added a comment at: https://github.com/Atmosphere/nettosphere/pull/129
let me know what you think.
I think all that is left is to fix the tests, cleanup the HttpHeaders a little, and possibly get rid of the FluentStringsMap I added that was removed in the AHC 2+ onwards.
@seamusmac I found out that simple tests like the LongPollingTest::basicWebSocketTest
suffer from double socket:on("message", ...)
callbacks after the necessary adjustments in upgrading. With the first callback carrying the expected string and the second callback reporting an empty message string, which overrides the initial value and leads to the test failure (and fails many others in a similar "second callback overwrites initial value" - fashion). Before the upgrade (and adoptions of code) there was only one callback, i.e. the first we see now, but no consecutive, unexpected, additional callbacks.
@jfarcand, @slandelle does that ring a bell by any chance ?
@thabach I seen the same with StreamTransportTests too. I posted the full TRACE log from StreamingTest.basicWebSocketTest() test above. Here is it again, this line looks v. suspicious.
https://gist.github.com/seamusmac/8ffd9e194684b3018f91c3e32055a3d0#file-gistfile1-txt-L1564
@jfarcand I wonder are these problems in Nettosphere V3, That is the other upgrade I performed here, to get netty versions relatively in sync. How many people are using Nettosphere V3? I tried an upgrade once and failed.... :( I could be completely wrong though.. May need to try again
@seamusmac please try the latest nettosphere v3 snapshot of this morning, I tried to get it all in sync in regards to netty version used by both AHC and nettosphere.
@slandelle I can get 80% of the test pass if I adjust the org.asynchttpclient.netty.handler.HttpHandler
to not have the last chunk propagate. There is a line that reads // Netty 4: the last chunk is not empty
in the handleChunks(...)
-method. For me the last chunk for the exercised tests in our test suite is always empty.
Please advise...
@seamusmac No idea how many people are using Nettosphere 3 (but I do have customer) but it could potentially be the issue. Hard to say... I would need to look into it but right now it is totally impossible for me to work on that :-( Possibly in September once we have lunched
@slandelle actually it is even worse, I see the last chunk entering the org.asynchttpclient.netty.handler.HttpHandler
over and over again, endlessly... BUT some hours later into debugging the problem seems to be with us after all ðð
@seamusmac something with the handling of completing Future
s in wasync seems to be broken in this pull-request and results in propagating the last chunk endlessly
@jfarcand on another note, remind me for wasync with non-binary payload, do we expect to receive aggregated messages only in socket:on("message", ...)
@seamusmac as of changes in async-http-client in regards to the propagation of the terminating chunk (which have an empty body) and of it being pushed into AsyncHandler::onBodyPartReceived(...)
nonetheless for signalling the latter (at least this is the reason for pushing it from what I presume, @slandelle maybe you could give us the rational), we should add a precondition check in all classes and their respective onBodyPartReceived(...)
derived method, like this :
if (bodyPart.isLast()) { return AsyncHandler.State.ABORT; }
after I did that, I have still some tests failing, namely these
Failed tests:
status404FunctionTest(org.atmosphere.tests.LongPollingTest): expected [404] but found [200]
status404FunctionTest(org.atmosphere.tests.StreamingTest): expected [404] but found [200]
basicDecoderTest(org.atmosphere.tests.WebSocketsTest): expected object to not be null
basicEchoEncoderTest(org.atmosphere.tests.WebSocketsTest): expected [<-echo->] but found [null]
encodersChainingTests(org.atmosphere.tests.WebSocketsTest): expected [<-echo->] but found [null]
eventDecoderTest(org.atmosphere.tests.WebSocketsTest): expected object to not be null
multipleFireBlockingTest(org.atmosphere.tests.WebSocketsTest): expected [PINGPONG] but found []
multipleFireTest(org.atmosphere.tests.WebSocketsTest): expected [PINGPONG] but found []
postTest(org.atmosphere.tests.WebSocketsTest): expected object to not be null
testTimeoutAtmosphereClient(org.atmosphere.tests.WebSocketsTest): expected [true] but found [false]
defaultTypedTest(org.atmosphere.tests.TypedTest): expected object to not be null
predefinedMessageTest(org.atmosphere.tests.TypedTest): expected object to not be null
If you happen to have some time contributing into fixing those remaining test, that would be very appreciated.
Cheers, Christian
PS: the Future
chaining proved fine and the endless propagation of the last chunk, were pretty much instant replies of subsequent GET
s in long-polling (from the embedded nettosphere instance). Your suggested Nettosphere "fix" made sure that the subsequent GET
s were returning the same buffer again (which is of course is not what we generally want and why I reverted that change).
@slandelle - working on the upgrade of wasync
to an up-to-date AsyncHttpClient
version anew ðĪŠ - may I get your thoughts on the following in your code base:
In case of this being the last chunk (a chunk of 0<CR><LF>
) with an optional trailer, carrying additional header fields, which you propagate nicely in handleChunk()
right before the screenshot-ed lines above, why do you (caused by || last
) additionally choose to propagate the empty content-buf
?
It is not really a ResponseBodyPart
, and is not expected by some of the wasync
unit tests and make it fail. Is it done to signal the receipt of the last chunk ?
@seamusmac could you alter your pull request to prevent those last chunks from propagating in altering all our onBodyPartReceived
overrides with a conditional clause on exercising the up-call, like so:
this should make most (if not all) of the unit tests pass. Happy to hear how that goes - cheers Christian.
@seamusmac Any idea of how much work is left on this before it would be ready for merging? I'm a bit interested in getting it moved forward.
@slovdahl I haven't looked at this in quite a long time, and don't have the time currently to pick it up again. I pushed the last changes I had made based on @thabach replies above. There are currently 8 tests failing out of 117.
Failing Tests: basicIOExceptionTest(org.atmosphere.tests.LongPollingTest): Connection refused: no further information: /127.0.0.1:49289 status404FunctionTest(org.atmosphere.tests.LongPollingTest): expected [404] but found [200] serializeFutureGetTest(org.atmosphere.tests.LongPollingTest): expected [PINGPONG] but found [PINGPONG status404FunctionTest(org.atmosphere.tests.StreamingTest): expected [404] but found [200] basicConnectExceptionTest(org.atmosphere.tests.WebSocketsTest): Connection refused: no further information: /127.0.0.1:58835 basicIOExceptionTest(org.atmosphere.tests.WebSocketsTest): Connection refused: no further information: /127.0.0.1:59470 serverDownTest(org.atmosphere.tests.WebSocketsTest): Connection refused: no further information: localhost/0:0:0:0:0:0:0:1:815 testTimeoutAtmosphereClient(org.atmosphere.tests.WebSocketsTest): Connection refused: no further information: /127.0.0.1:62768
Last I remember trying to fix was the status404FunctionTest ones, and I think that requires a fix in Nettosphere, cos the GET requests are returning 200 Ok, when they should be returning a 404. Nettosphere just logs at https://github.com/Atmosphere/nettosphere/blob/9d6f024a471f188a20d5669cbff20c1186c79aa7/server/src/main/java/org/atmosphere/nettosphere/BridgeRuntime.java#L690, and no error happens afterwards. That should fix 2 more. and the rest I'm not sure about.
Feel free to fix/modify/take the code if you want to give it a try.
Thanks @seamusmac, that's very helpful!
don't have the time currently to pick it up again
No worries, a huge thanks for the effort you've put into it this far :bow:
I'll put some effort into it now and see if I can get any further with the test failures.
@seamusmac @slovdahl I've made some progress, at least with the timing issue of IOException
.
@jfarcand @slovdahl That brilliant that you made some progress on this. I pulled your code @jfarcand and gave it a quick run here locally on my windows machine.
Looks like only one test failing now. timeoutTest(org.atmosphere.tests.WebSocketsTest): expected [OPENCLOSEERROR] but found [OPENCLOSEERRORERROR]
I also left a comment on another TODO that I had put in, not sure the best way to handle those.
@seamusmac I don't see your commit? Maybe you haven't pushed it on the proper branch? I still see some random failure but I'm tempted to merge your pull request and continue on the master branch. I will wait for your commit first.
@jfarcand sorry I don't have any other commit, or time to look at this in detail. Was just informing you about the TODO around the IOExceptions on the asyncclient.close() methods above.
Feel Free to merge this PR if you wish, and fix those niggly bits yourself. You'll have it done in a few mins where it would take me 10 times as long to figure out.
Seeing as this is a major change, would it be a good idea to do a major version bump for wasync?
@seamusmac Yes, major version for sure!
@seamusmac I've released 3.0.0 :-) The test are flacky as they all pass locally, so I went ahead and cut 3.0.0. Let me know if you are facing any issue.
Wow, thank you @jfarcand! :bow: This is awesome.
Ran into this bug https://github.com/AsyncHttpClient/async-http-client/issues/891
So tried upgrading the async-http-client version.
Probably not ready to submit yet. Submitting PR to run the tests to see how it looks.