Open David-Ongaro opened 5 months ago
Hmmm. Do you see an IllegalArgumentException anywhere in the logs that says "Proxying HTTP/2 messages not supported yet"?
I wonder if it's negotiating HTTP/2 and burying the exception somehow. I didn't think I changed anything with the HTTP/1 proxying; if so, that's an error. But I never used proxying, and never did much with it. It hasn't been implemented for HTTP/2 yet.
> Hmmm. Do you see an IllegalArgumentException anywhere in the logs that says "Proxying HTTP/2 messages not supported yet"?
No, I don't see such a message. I also don't think it's using HTTP/2, but to be sure: is there a pool config to force HTTP/1.1? Then I can retry with that.
> But I never used proxying, and never did much with it.
Still, you can try to reproduce the errors with a local proxy, like I showed above.
HTTP/2 is explicitly opt-in, so unless you're setting it as a desired HTTP version, it's HTTP/1 by default.
> > But I never used proxying, and never did much with it.
>
> Still, you can try to reproduce the errors with a local proxy, like I showed above.
Oh, that sentence is more of a note for Moritz, who's taken over as maintainer: I changed virtually nothing in the proxy code itself, so the error probably lies elsewhere, such as in the refactoring to support HTTP/2 or in the code path leading into the proxy code.
Since updating from 0.6.4 to 0.8.0 I see strange exceptions for proxied calls.
Decoding the hex string yields:
So it seems to be interpreting a plain HTTP response as HTTPS. (I'm trying to reach an HTTPS endpoint via an HTTP proxy, but maybe it's making an unproxied HTTP call instead while still expecting HTTPS.)
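For anyone who wants to repeat this check on their own stack trace, here is a minimal sketch in Python. The hex string is a hypothetical stand-in rather than the dump from my logs, and the helper names are made up for illustration. It decodes the hex dump and checks whether the bytes look like a TLS record or like the start of a plain HTTP response.

```python
# Hypothetical hex dump; substitute the one from your own exception message.
SAMPLE_HEX = "485454502f312e31"


def decode_hex_dump(hex_dump: str) -> str:
    """Turn a hex dump into text, replacing non-ASCII bytes."""
    raw = bytes.fromhex(hex_dump)
    return raw.decode("ascii", errors="replace")


def looks_like_tls_record(raw: bytes) -> bool:
    """A TLS record starts with a content type (20-23) and major version 0x03."""
    return len(raw) >= 2 and raw[0] in (0x14, 0x15, 0x16, 0x17) and raw[1] == 0x03


def looks_like_plain_http(raw: bytes) -> bool:
    """A plain HTTP/1.x response starts with the literal bytes 'HTTP/'."""
    return raw.startswith(b"HTTP/")


if __name__ == "__main__":
    raw = bytes.fromhex(SAMPLE_HEX)
    print(decode_hex_dump(SAMPLE_HEX))   # HTTP/1.1
    print(looks_like_tls_record(raw))    # False
    print(looks_like_plain_http(raw))    # True
```

If the decoded bytes start with `HTTP/` while the client expected a TLS record, that supports the guess that a plaintext response reached the TLS decoder.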
It's difficult for me, though, to reproduce the proxy settings we use in prod locally. Therefore, I will use a local proxy for this bug report here:
After this, we can confirm that Aleph 0.6.4 works as expected:
Trying the same in 0.7.0 or higher yields:
Or alternatively, after a timeout of a minute or so:
Trying http yields a strange 503:
If I try https afterwards, it seems it still tries to interpret the result as http:
However, I can sometimes get this error right away, without trying http first, so it doesn't seem to be a matter of misusing a pool for different protocols (which we don't do in prod anyway).
So even though I cannot reproduce locally the exact error I see in prod, I think we can conclude that there is a regression in 0.7.0 and later compared to 0.6.4.