I have a small idea for fetch-based networking that I would first like to check with you before committing.
Currently, the full HTTP response (header and body) from fetch() is cached before being forwarded to the originating HTTP client.
My proposal is to only wait for the HTTP response headers from fetch(), and then to forward the response body asynchronously chunk by chunk to the originating HTTP client using ReadableStream and HTTP Chunked transfer encoding.
To test this, I changed on_data_http() in fetch_network.js based on a ReadableStream usage pattern from MDN; it's pretty straightforward and works well in my tests (to my knowledge, all HTTP applications support chunked Transfer-Encoding). This also removes the need for FetchNetworkAdapter.fetch(), whose relevant code I merged into on_data_http().
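For illustration, a minimal sketch of the pattern (not the actual on_data_http() code; send_response_headers_to_guest() and forward_chunk_to_guest() are hypothetical helpers standing in for the packet-writing side):

```js
// Sketch: forward a fetch() response to the originating client chunk by
// chunk instead of buffering the whole body first.
async function stream_response(url)
{
    const response = await fetch(url);
    // The headers are available here, long before the body has arrived:
    send_response_headers_to_guest(response.status, response.headers);
    const reader = response.body.getReader();
    for(;;)
    {
        const { done, value } = await reader.read();
        if(done) break;                  // end of body
        forward_chunk_to_guest(value);   // value is a Uint8Array
    }
}
```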
The upside is that this greatly increases both the responsiveness and the effective speed of fetch-based networking, all the more so as the size of the HTTP response grows.
The downside is that the originating HTTP client never gets to see the Content-Length of the response it is receiving, so it can no longer show meaningful progress statistics beyond what it has received so far. Any Content-Length must be ignored when a Transfer-Encoding is used, and the Content-Length from the fetch response headers can be incorrect (for example, for compressed response bodies). The only way around this seems to be prefetching, as it is done currently. I might be wrong, but to my knowledge the choice is categorically between prefetching and the loss of Content-Length (for any request from the originating HTTP client). I wasn't aware of that until I had finished implementing it.
I would of course wait with this until #1182 is merged, and include any changes from there.
I'm really undecided which side outweighs the other here. What do you think, thumbs up or down?
Generally thumbs up, but I'm unsure about the downsides. I wonder if HTTP Chunked transfer encoding is necessary at all. Can't the data be streamed in separate ethernet packets?
and the Content-Length from the fetch response header can be incorrect (for example, for compressed response bodies).
I believe for compressed bodies (via Content-Encoding), Content-Length indicates the length of the compressed body. As far as I know, compression via Transfer-Encoding isn't really used or widely implemented.
EDIT: The gist of the matter is that I cannot determine whether the Content-Length from the fetch() response headers is correct, so I cannot use it at all, and I have no Content-Length to pass to the originating HTTP client. The only option I see at this point is to use chunked encoding (sketched below).
If I could rely on the Content-Length in the received fetch() response headers, then I could just pass the unmodified chunks on to the originating client, along with that very same Content-Length.
I build and send the response headers to the originating client when I receive the first response body chunk, but I don't have a Content-Length to set in these response headers (well I do, but it is wrong when compression is used).
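The chunk framing this implies is standard HTTP/1.1 chunked transfer encoding: each chunk is prefixed with its size in hex, and the body ends with a zero-length chunk. A sketch (not the exact code from the PR):

```js
// Wrap one body chunk as <size in hex>\r\n<data>\r\n; the body is
// terminated by the zero-length chunk "0\r\n\r\n".
function encode_chunk(data)   // data: Uint8Array
{
    const enc = new TextEncoder();
    const head = enc.encode(data.length.toString(16) + "\r\n");
    const tail = enc.encode("\r\n");
    const out = new Uint8Array(head.length + data.length + tail.length);
    out.set(head, 0);
    out.set(data, head.length);
    out.set(tail, head.length + data.length);
    return out;
}
const LAST_CHUNK = new TextEncoder().encode("0\r\n\r\n");
```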
In fetch(), my browser negotiates gzip compression with my Apache server and transparently decompresses the response before passing me the decompressed chunks (and passes me the compressed size in Content-Length, which is useless). I never get to know how long the response is going to be myself.
I tried to suppress compression, but Accept-Encoding is amongst the forbidden header names in fetch, so I can't set that header in the fetch request.
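This is easy to reproduce; in browsers, the Fetch spec makes request headers silently drop forbidden header names (a quick sketch):

```js
// Forbidden request headers are silently dropped by the browser:
const req = new Request("https://example.org/", {
    headers: { "Accept-Encoding": "identity" },
});
console.log(req.headers.get("Accept-Encoding"));   // null in browsers
```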
I wonder, is there any other method besides Chunked transfer encoding that I could use when I don't know the length of the response?
In fetch(), my browser negotiates gzip compression with my Apache server and transparently decompresses the response before passing me the decompressed chunks (and passes me the compressed size in Content-Length, which is useless). I never get to know how long the response is going to be myself.
I see the problem now. fetch passes you already-decompressed chunks, but Content-Length indicates the compressed size on the wire.
I wonder, is there any other method besides Chunked transfer encoding that I could use when I don't know the length of the response?
It seems to be legal to send an HTTP response with neither Content-Length nor Transfer-Encoding: chunked, in which case the server closes the TCP connection after writing all data.
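Roughly, the response head would look like this (a sketch; the header set is illustrative). The end of the body is then signalled by closing the TCP connection, HTTP/1.0-style framing that is still legal in HTTP/1.1:

```js
// Response head without Content-Length or Transfer-Encoding:
const head =
    "HTTP/1.1 200 OK\r\n" +
    "Content-Type: text/html\r\n" +
    "Connection: close\r\n" +   // no keep-alive on this connection
    "\r\n";
// write the head, stream the body chunks verbatim, then close the socket
```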
I actually do all this to learn and also try out new things, so thank you very much for this information!
It should fit in well since I'm already closing the HTTP connection to the originating client once the transaction is complete (current fetch-based networking can't do that because of the FIN packet that is sent too early).
It works without both Content-Length and chunked encoding! :)
Of course, the originating client still doesn't get to learn the Content-Length of the response body, so it cannot show any meaningful download progress information besides the total bytes received so far.
Looks good to merge to me (one more small comment above). @chschnell Any objections? (asking because it's still marked as a draft)
I'm still not sure whether you want the fetch() improvement or not; I can check in the code any time, it's not that much code and it's well-tested.
The question is whether you prefer to have the performance boost or to keep the Content-Length intact.
I tend to the performance boost, but I'm in no position to make that decision.
The question is whether you prefer to have the performance boost or to keep the Content-Length intact.
How about using the default way for small responses, or when the custom x-fetch... header is set (like a fallback)?
The question is whether you prefer to have the performance boost or to keep the Content-Length intact.
Yeah, let's do proper chunking and give up the Content-Length. (Perhaps leave the old path in the code, so it could be toggled if there are any problems.)
Ok, I hope this was the last commit, from my side this PR is ready to go.
I've commented out two blocks of code in fetch_network.js (for fallback), and put a single new one in. Some kind of software switch could be added to toggle between the two implementations, but I tried to keep it simple.
@SuperMaxusa It should be easy to extend the code as it is now to a two-way solution, for example to prefetch the body when it is not too large, or when the originating client's HTTP request contained a specific header. If anyone wants it I could add it in a later PR.
It should be easy to extend the code as it is now to a two-way solution, for example to prefetch the body when it is not too large, or when the originating client's HTTP request contained a specific header.
Yes, since most HTTP clients can handle these responses without Content-Length (e.g. on KolibriOS, DSL with Dillo and Firefox, and Windows 95 with Internet Explorer 4.0 this works properly), let's leave your implementation with ReadableStream.
By the way, in many CI runs the mocked.example.org test failed (as an exception, this behaviour is allowed), while the test with the real example.org is more reliable, and after the recent changes this test doesn't work anymore. Can we remove this test?
e.g. on KolibriOS, DSL with Dillo and Firefox, Windows 95 with Internet Explorer 4.0, this works properly
@SuperMaxusa Thanks a lot for testing, that is really helpful of you!
By the way, this transfer model (transfer without Content-Length, using end-of-connection to indicate end-of-response-body) is how HTTP 1.0 used to work, and support for it is still required in HTTP 1.1 (here the relevant section). I guess for these two reasons it is widely supported.
By the way, in many CI runs the mocked.example.org test failed (as an exception, this behaviour is allowed), while the test with the real example.org is more reliable, and after the recent changes this test doesn't work anymore. Can we remove this test?
Starting test #6: Curl mocked.example.org
Captured: wget -T 10 -O - mocked.example.org
Captured: echo -e done\\tmocked.example.org
Captured: ~% wget -T 10 -O - mocked.example.org
Captured: Connecting to mocked.example.org (192.168.87.1:80)
Fetch Failed: http://mocked.example.org/
TypeError: fetch failed
Captured: wget: server returned error: HTTP/1.1 502 Fetch Error
Captured: ~% echo -e done\\tmocked.example.org
Captured: done mocked.example.org
[...]
Unless there's more to it, test #6 in tests/devices/fetch_network.js fails correctly: it attempts to fetch() from http://mocked.example.org/, which is an invalid DNS name. The next test, #7, attempts to fetch() from http://example.org/, which is a valid URL, and this test succeeds.
I am not sure, but in case you mean test #6, maybe it is intended to just test that a failure in fetch() is handled correctly? Was this test ever meant to succeed?
it attempts to fetch() from http://mocked.example.org/ which is an invalid DNS name
This test has a custom fetch for which http://mocked.example.org is an existing website, see also https://github.com/copy/v86/pull/1061#discussion_r1676848522 and https://github.com/copy/v86/pull/1061/commits/6518770365d65ef8fe9ea9d06630c61539858ca2. Now that FetchNetworkAdapter.prototype.fetch has been commented out and replaced by fetch with ReadableStream, it's not called anymore and the test fails.
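In other words, the test monkey-patches the adapter's fetch. Roughly like this hedged sketch (names and the return shape are illustrative, not the actual test code):

```js
// Serve mocked.example.org from memory, delegate everything else to the
// real implementation. `adapter` is a FetchNetworkAdapter instance.
const real_fetch = adapter.fetch.bind(adapter);
adapter.fetch = async function(url, options)
{
    if(new URL(url).host === "mocked.example.org")
    {
        const body = new TextEncoder().encode("Hello, world!").buffer;
        return [{ status: 200, statusText: "OK", headers: {} }, body];
    }
    return real_fetch(url, options);
};
```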
this transfer model (transfer without Content-Length and to use end-of-connection to indicate end-of-response-body) is how HTTP 1.0 used to work, and support for it is still required in HTTP 1.1
Interesting, I would never have thought this works as HTTP 1.0 compatibility, thanks for that information!
Ah I see, thanks for the explanation.
It's true that this test cannot work anymore because I removed FetchNetworkAdapter.fetch(); I wasn't aware of this side-effect.
I would also argue (with my limited overview) that this test is no longer needed with the changed class layout.
On second thought, FetchNetworkAdapter.fetch() could stay: it no longer gets called by on_data_http(), but exclusively by that test function now.
Perhaps a bit of a stretch, but maybe someone needs the blocking fetch() in the future, and then FetchNetworkAdapter.fetch() would be useful to have.
Related note I just noticed, since this PR touches wisp_network: if a connection was closed in wisp and the v86 VM tries sending a TCP packet, it crashes the whole thing. It may be worth adding some extra checks around that code just to make sure everything's behaving fine.
This PR might fix it since it touches closing logic, I'm just not sure.
Wisp does support bidirectional stream closure (I think I just screwed up how I handled it in v86), is there some issue you encountered with it? Not sending the voluntary close may cause resource leaks for both the client and the server
Related note I just noticed, since this PR touches wisp_network: if a connection was closed in wisp and the v86 VM tries sending a TCP packet, it crashes the whole thing. It may be worth adding some extra checks around that code just to make sure everything's behaving fine.
This PR might fix it since it touches closing logic, I'm just not sure.
I did not come across this when writing the PR; I mostly changed fake_network.js, and wisp_network.js only minimally where necessary.
As a test just now, I killed my WISP server (epoxy) while V86 was running without using the network, and after a few seconds it throws an exception in the browser console. Then I restarted the WISP server, and the same V86 instance picked it up just fine, network was ok.
Then I killed the WISP server while having a telnet session open in V86, caused some traffic, restarted the WISP server, and tried to exit telnet, which took around half a minute (two exceptions or so), but then it came back and I was able to start a new telnet session without problems, still in the same V86 instance.
It looks stable to me (V86 survived), but maybe I'm not testing right. How can I provoke this situation you mean?
Wisp does support bidirectional stream closure (I think I just screwed up how I handled it in v86), is there some issue you encountered with it? Not sending the voluntary close may cause resource leaks for both the client and the server
I added a WISP frame of type CLOSE with reason 0x02 (voluntary stream closure) to the V86 WISP client just a couple of days ago and removed it again a couple of hours ago after I was told yesterday (in discussion #1175) that this wasn't supported.
I could add it back in easily, I left a comment in fake_network.js in my branch where this would belong (it's where we receive a TCP packet with FIN flag from V86 while in TCP state ESTABLISHED).
The reason I added it was to close only V86's side of the TCP connection (so-called "half-close"), for reasons see here:
The example from the first link is quite simple (see there for the full explanation):
ssh hostname sort < datafile
This needs TCP half-close in order to work. IIRC then FTP could also be hindered from working without support for half-close.
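For reference, a minimal Node.js sketch of what half-close looks like at the socket level (illustrative host, port and payload; not v86 code):

```js
import net from "node:net";

// end() sends a FIN: we declare end-of-input, but the socket stays
// readable until the peer closes its own side (the half-close).
const socket = net.connect(12345, "example.org", () =>
{
    socket.write("data to be sorted\n");
    socket.end();   // half-close our sending direction
});
socket.on("data", chunk => process.stdout.write(chunk));   // still readable
socket.on("end", () => console.log("peer closed its side as well"));
```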
I'm a bit confused now. Does WISP support TCP half-close? What does CLOSE with reason 0x02 (from the client) exactly do, and is it generally supported by WISP servers?
I've never heard of half close, okay I see, wisp doesn't support that. Does full closing still work? I thought that commit removed full closing and that's why I was concerned
close 0x04 closes the TCP stream entirely, no more receiving or transmitting can be done. The server doesn't really care about the reason; it's there more so as a courtesy.
Edit: fix wrong packet type
I've never heard of half close, okay I see, wisp doesn't support that. Does full closing still work? I thought that commit removed full closing and that's why I was concerned
close 0x02 closes the TCP stream entirely, no more receiving or transmitting can be done
Only WispServerCpp implements CLOSE/0x02 like that, neither epoxy-tls nor wisp-js do, in my observation. There was no CLOSE/0x02 in V86's WISP client before I added it a couple of days ago.
So in the case of WispServerCpp that commit did remove full closing, but in the case of the other two WISP servers it didn't change anything. I did that because none of them implemented half-close. Now it all comes down to a timeout by the server, or a leaked connection in case that never happens.
Maybe I should put the CLOSE/0x02 back in after all, even though it will break certain applications and has no effect with some WISP servers.
Are you saying the other wisp servers don't support closing or don't support half closing? Sorry I think I'm misunderstanding
To my understanding every current wisp server does support full closing of streams from either direction (client and server can both report closed streams I believe)
Also close is 0x04, the reason is 0x02. Just wanted to clarify because I stated it incorrectly above in my own message
@chschnell
The protocol does not specify any special behavior on the server side for specific close reasons. I don't know why WispServerCpp behaves differently here, but wisp-js and epoxy-server are following the protocol and closing fully (Wisp V1 reference, Wisp V2 reference):
Any CLOSE packets sent from either the server or the client must immediately close the associated stream and TCP socket. The close reason in the payload doesn't affect this behavior, but may provide extra information which is useful for debugging.
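Assuming the wire layout from the Wisp spec (a u8 packet type, a u32 little-endian stream id, then the payload), a client-side CLOSE frame can be built like this (illustrative sketch, not v86 code):

```js
// Build a Wisp CLOSE frame: type 0x04, u32 LE stream id, 1-byte reason.
function build_wisp_close(stream_id, reason = 0x02)   // 0x02: voluntary closure
{
    const frame = new Uint8Array(6);
    const view = new DataView(frame.buffer);
    view.setUint8(0, 0x04);               // packet type CLOSE
    view.setUint32(1, stream_id, true);   // stream id, little-endian
    view.setUint8(5, reason);
    return frame;
}
```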
On epoxy, you can use RUST_LOG=trace to see that it does close streams (stream uuid f9f5dcdb-5edc-40d2-9ba7-b7686b6cfc19 disconnected for client id "127.0.0.1:46658"):
[2024-11-21T20:32:37Z TRACE epoxy_server] routed "127.0.0.1:46658": Wisp
[2024-11-21T20:32:37Z DEBUG epoxy_server::handle::wisp] new wisp client id "127.0.0.1:46658" connected with extensions [1], downgraded false
[2024-11-21T20:32:37Z TRACE epoxy_server::handle::wisp] sent ping to wisp client id "127.0.0.1:46658"
[2024-11-21T20:32:37Z TRACE epoxy_server::handle::wisp] sent ping to wisp client id "127.0.0.1:46658"
[2024-11-21T20:32:37Z DEBUG epoxy_server::handle::wisp] new stream created for client id "127.0.0.1:46658": (stream uuid f9f5dcdb-5edc-40d2-9ba7-b7686b6cfc19) ConnectPacket { stream_type: Tcp, destination_port: 8000, destination_hostname: "localhost" } ConnectPacket { stream_type: Tcp, destination_port: 8000, destination_hostname: "127.0.0.1" }
[2024-11-21T20:32:37Z DEBUG epoxy_server::handle::wisp] stream uuid f9f5dcdb-5edc-40d2-9ba7-b7686b6cfc19 disconnected for client id "127.0.0.1:46658"
[2024-11-21T20:33:07Z TRACE epoxy_server::handle::wisp] sent ping to wisp client id "127.0.0.1:46658"
[2024-11-21T20:33:37Z TRACE epoxy_server::handle::wisp] sent ping to wisp client id "127.0.0.1:46658"
[2024-11-21T20:33:59Z DEBUG epoxy_server::handle::wisp] wisp client id "127.0.0.1:46658" multiplexor result Ok(())
[2024-11-21T20:33:59Z DEBUG epoxy_server::handle::wisp] shutting down wisp client id "127.0.0.1:46658"
[2024-11-21T20:33:59Z TRACE epoxy_server::handle::wisp] waiting for tasks to close for wisp client id "127.0.0.1:46658"
[2024-11-21T20:33:59Z DEBUG epoxy_server::handle::wisp] wisp client id "127.0.0.1:46658" disconnected
Also, WispServerCpp sends a 0x04 CLOSE packet back after the client sends one. It is the only server to do this, and it actually isn't in the spec; the client is supposed to assume the stream is closed immediately after sending the close packet. Just wanted to note that on this GitHub thread.
Another note: WispServerCpp doesn't actually close any sockets that it forwards, violating the spec, so the remote server never sees the close from the wisp client. Edit: This can also result in sending packets to sockets that were closed according to the client, emulating half-closed behavior without actually closing anything.
I think it's worth looking into making a protocol extension for half closing, @ading2210 @r58Playz
@ProgrammerIn-wonderland
Are you saying the other wisp servers don't support closing or don't support half closing? Sorry I think I'm misunderstanding
Sorry about the confusion. I meant to say the other WISP servers don't support closing, but it turned out I was wrong about that.
Also close is 0x04, the reason is 0x02. Just wanted to clarify because I stated it incorrectly above in my own message
Understood. By 0x02 I always meant reason 0x02 (Voluntary stream closure).
@r58Playz
On epoxy, you can use RUST_LOG=trace to see that it does close streams (stream uuid f9f5dcdb-5edc-40d2-9ba7-b7686b6cfc19 disconnected for client id "127.0.0.1:46658")
Thanks, I enabled tracing and indeed, I see the same message. Sorry about my confusion, I was wrong when I said it wasn't supported by epoxy-tls/wisp-js, they indeed do.
My latest commit is a slight change in strategy with this close issue.
I now distinguish between on_shutdown() (FIN) and on_close() (RST), which is the right way to model it in fake_network.js.
I implemented both on_shutdown() and on_close() for WISP using a CLOSE frame with reason 0x02. This does not implement half-closed TCP connections (currently not supported), but it also doesn't leak idle TCP connections (see the sketch below).
It would be trivial to change on_shutdown() in case WISP support changes in the future.
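Schematically, the mapping looks like this (method and helper names are illustrative, not the actual fake_network.js code):

```js
// FIN (orderly shutdown) and RST (abortive close) are modelled as two
// separate callbacks; for WISP, both currently end in the same CLOSE
// frame with reason 0x02, since half-close is not supported.
class TcpConnectionSketch
{
    on_shutdown()   // guest sent FIN
    {
        this.send_wisp_close(0x02);
    }
    on_close()      // guest sent RST
    {
        this.send_wisp_close(0x02);
    }
}
```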
Thanks! This does make retrofitting it much easier if the server ever advertises support for TCP half-closing.
Currently there are two different fake DNS methods in V86:

- replying with 192.168.87.1 for any given DNS name
- DNS-over-HTTPS via cloudflare-dns.com, unless specified in config.doh_server

Fetch-based networking uses the first method, wisp-based the second, and I'm not sure why.
Faking DNS with DoH works just fine for both fetch- and wisp-based networking, and it returns real results.
So I would like to move the DoH-code from wisp_network.js into handle_fake_dns() (fake_network.js) and make it the default fake DNS method.
Question: Is the first method without DoH still needed or can it be removed? I think it is not needed, only fetch-based networking uses it, and with fetch() this DNS method doesn't really matter (the browser performs the DNS lookup for the URL passed to fetch()).
In case I should keep the first method, I would suggest adding a new option config.dns_method next to config.doh_server with possible values static and doh (default doh), unless someone has a better idea.
Question: Is the first method without DoH still needed or can it be removed? I think it is not needed, only fetch-based networking uses it, and with fetch() this DNS method doesn't really matter (the browser performs the DNS lookup for the URL passed to fetch()).
For fetch-based networking, there is not much need for it: https://github.com/copy/v86/pull/1061#issuecomment-2184188633, and fetch with an IP address is buggy, especially if HTTPS-only mode is enabled in the browser.
Fetch networking uses the Host: header, so the IP address given is irrelevant; to save time it just returns a dummy IP.
Since wisp actually connects to the IP, it has to do the real resolution.
Sorry, just saw your comment SuperMaxusa, my Internet is glitching.
Personally, I don't think it makes much sense to add DoH to fetch, since it's not really necessary given how fetch operates (unless you're running a SUPER OLD pre-Host-header HTTP client). But I do think the DoH server should be configurable.
@SuperMaxusa @ProgrammerIn-wonderland
Thanks a lot for your input!
We all agree that DoH doesn't make much sense with fetch-based networking, my point was that it also doesn't hurt because it matters so little. But it's fine, I'll leave it at that.
In my latest commit I moved the DoH code from WispNetworkAdapter.send() into the new function handle_fake_dns_doh() in fake_network.js, and only WispNetworkAdapter uses DoH, just as before. So it's only a refactoring, the code behaves unchanged. Everything is in place in case someone wants to add a config option for this, but I don't see the point.
The benefit of all this is that all of the code in WispNetworkAdapter.send() that was duplicated from fake_network.js is now gone, which somewhat simplifies reasoning about the fake networking control flow (it made me scratch my head quite a bit).
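For reference, the kind of lookup such a handle_fake_dns_doh() performs is a standard RFC 8484 request. A minimal sketch (endpoint path and content type per RFC 8484; not necessarily the exact v86 code):

```js
// POST a wire-format DNS query to a DoH resolver and return the
// wire-format answer; the default server mirrors config.doh_server.
async function doh_lookup(query_bytes, server = "cloudflare-dns.com")
{
    const response = await fetch(`https://${server}/dns-query`, {
        method: "POST",
        headers: { "Content-Type": "application/dns-message" },
        body: query_bytes,
    });
    return new Uint8Array(await response.arrayBuffer());
}
```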
Thanks! This looks good to merge to me now. Let me know if there's another change you'd like to get in. I will try to rebase (to get rid of the merge commits).
Yes, let's merge it.
I see two things left to do, but I want to split them off into a new PR in a couple of weeks.
I will do other things in the next week or two, but I think then I'll take another go at this, if you don't mind.
Great work! A huge improvement over my hacked up proof of concept.
My pleasure! And it was just a bit rough around the edges, the core was rock-solid.
Merged manually, thanks all!
@copy I am very sorry, but I just noticed that I broke the cors_proxy function in FetchNetworkAdapter. Only a single line needs to be fixed in fetch_network.js, is a quick PR ok?
@chschnell Sure
Several improvements in the fake network code that deal with data flowing towards the guest OS (i.e. packet encoding and downstream buffering) and TCP state mechanics.