Speaking of the sheer unpredictability of the error here, I just happened to set the log level from `INFO` to `DEBUG`, and that somehow made the transfer work just fine. Of course, I would not treat that as a fix but rather as a workaround, but it still moves us ahead a bit.
In the above picture, the top left pane is the sending client, the top right pane is the receiving client, the bottom left pane is the broker server, and the bottom right pane is a watch command used to check how far the transmission has progressed, as per the changes made to the codebase.
If you wrote the loop that sends the file in 640-byte chunks in the obvious way, there's a good chance that you're running into the problem described here: https://websockets.readthedocs.io/en/stable/faq/asyncio.html#why-does-my-program-never-receive-any-messages
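For readers following along, the "obvious way" presumably looks something like the loop below (illustrative names only, not the project's code); the point raised in that FAQ entry and later in this thread is that such a loop may never give the event loop a chance to run anything else:

```python
async def send_file_chunks(sock, chunks):
    # The "obvious" tight loop: no await point other than send().
    for chunk in chunks:
        # If the outgoing buffer never fills up, this await can complete
        # without suspending, so the event loop may never get a chance to
        # run other tasks (including the one that reads incoming frames).
        await sock.send(chunk)
```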
@aaugustin I do have a couple of functions (or coroutines? My `asyncio` game is admittedly weak) that perform sustained `sock.send()` and `sock.recv()` calls. Take a look at the following snippets, for instance.
The receiving end stays in a sustained receiving mode until all the chunks have been received.
```python
async def collect_contents(sock: WebSocketClientProtocol, pack: bytes = b"") -> bool:
    fuse_file(pack)
    with logging_redirect_tqdm():
        with tqdm(total=standard.client_filesize, unit="B", unit_scale=True, unit_divisor=1024, leave=False, initial=len(pack)) as prog:
            for indx in range(standard.client_chks - 1):
                mesgcont = await sock.recv()
                if isinstance(mesgcont, bytes):
                    fuse_file(mesgcont)
                    prog.update(len(mesgcont))
    # general(f"Collecting file contents from {standard.client_endo} - (SHA256 {sha256(pack).hexdigest()})")
    return True
```
The sending end stays in a sustained sending mode until all the chunks have been sent.
```python
async def deliver_contents(sock: WebSocketClientProtocol) -> bool:
    with logging_redirect_tqdm():
        with tqdm(total=standard.client_filesize, unit="B", unit_scale=True, unit_divisor=1024, leave=False, initial=0) as prog:
            for indx in range(0, len(standard.client_bind) - 1):
                bite = read_file(standard.client_bind[indx], standard.client_bind[indx + 1])
                # general(f"Delivering file contents to {standard.client_endo} (SHA256 {sha256(bite).hexdigest()}).")
                prog.update(standard.client_bind[indx + 1] - standard.client_bind[indx])
                await sock.send(bite)
    return True
```
So, a couple of questions. Is `await asyncio.sleep(0)` on the sending end enough, or should I also consider having the same at the receiving end?
In practice, I've only encountered this problem on the `send` side. (Prompted by your comment, I've been thinking briefly about whether it could happen on the `recv` side; it turns out that I don't know; it would require uncommon circumstances. No one complained about that kind of issue in a `recv` loop in the 10+ years this library has existed, so I'm not too worried.)
The performance penalty shouldn't be measurable in your scenario. Running an iteration of the event loop is extremely fast. I remember benchmarking that extensively; the answer was always "not something you need to think about". Sending data requires a `send(2)` syscall, which will be much slower than just going to the event loop and back.
If you're concerned about performance, I would recommend a larger chunk size than 640 bytes — that's extremely small by modern standards.
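To make that concrete, here is a minimal sketch of a send loop that applies both suggestions, a larger chunk size and a yield per iteration; `send_file`, `sock`, and the 64 KiB figure are illustrative placeholders, not the project's actual helpers:

```python
import asyncio

CHUNK_SIZE = 64 * 1024  # 64 KiB; a placeholder value, not the project's setting

async def send_file(sock, path: str) -> None:
    # Read and send the file in larger chunks instead of 640-byte pieces.
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(CHUNK_SIZE)
            if not chunk:
                break
            await sock.send(chunk)
            # Yield once per iteration so other coroutines can run.
            await asyncio.sleep(0)
```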
For now, I would keep the `asyncio.sleep(0)` part on both ends and rigorously test to see under what circumstances I face the same issue. That is, of course, if the problem occurs again. Following your advice, I have increased the chunk size to 1536 bytes, but I would be interested to know your thoughts on what an optimal chunk size should be for one transfer occurrence and how big a chunk size I can send before I start seeing errors.
> I would be interested to know your thoughts on what an optimal chunk size should be for one transfer occurrence and how big a chunk size I can send before I start seeing errors.
On a decent modern network probably it doesn't matter. I'd go for 16kB to 64kB — probably anything between 4kB and 256kB would be indistinguishable — and let the TCP/IP layer do its job. It's better at optimizing network throughput than us :-)
If you expect usage on poor networks, mobile, etc. AND if you have a good resume strategy, then you could go for a smaller size to increase the chances that a chunk goes through before the connection breaks; and then you can resume.
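A minimal sketch of such a resume strategy, with entirely hypothetical helper names and a sidecar offset file (nothing below is from the project), might look like this:

```python
import os

CHECKPOINT_SUFFIX = ".offset"  # hypothetical sidecar file next to the partial download

def save_checkpoint(path: str) -> None:
    # Record how many bytes of the target file have been safely written so far.
    with open(path + CHECKPOINT_SUFFIX, "w") as handle:
        handle.write(str(os.path.getsize(path)))

def load_checkpoint(path: str) -> int:
    # On reconnect, the receiver asks the sender to resume from this offset.
    try:
        with open(path + CHECKPOINT_SUFFIX) as handle:
            return int(handle.read())
    except (FileNotFoundError, ValueError):
        return 0
```

On reconnect, the sender would then seek to the saved offset and deliver only the remaining chunks.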
While I plan to include resumability to address weaker connections, for now I am sticking to the vanilla "if it fails, just try again from the start" kind of transfer. For a file that is approximately 993 MiB in size and a broker server that runs on the same host as both the sending and receiving clients, these are the numbers I see.
| Chunk size (KiB) | Duration (s) | Improvement over previous (%) |
|---|---|---|
| 1 | 286.17 | 0 |
| 2 | 154.3 | -46.0810008 |
| 4 | 96.81 | -37.25858717 |
| 8 | 68.12 | -29.63536825 |
| 16 | 54.87 | -19.45096888 |
| 32 | 48.52 | -11.57280846 |
| 64 | 44.18 | -8.944765045 |
| 128 | 43.76 | -0.9506564056 |
| 256 | 42.5 | -2.879341865 |
| 512 | 42.36 | -0.3294117647 |
| 768 | 42.28 | -0.1888574127 |
| 1024 | FAIL | FAIL |
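For clarity, the Improvement column is the percentage change in duration relative to the previous row; for example, for the second row:

```python
# Improvement when going from 1 KiB chunks (286.17 s) to 2 KiB chunks (154.3 s).
previous, current = 286.17, 154.3
print((current - previous) / previous * 100)  # ≈ -46.081
```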
Until I can devise a resuming strategy (with a checkpoint written to a temporary directory after a certain number of chunks have been received), I think I would like to keep the range of chunk sizes from 64 KiB to 768 KiB. I am considering keeping 256 KiB as the default chunk size for users who do not want to tweak things.
Sounds sensible.
I am creating a file transfer service, the code of which can be found here. While transferring file contents from a sender client to a receiver client, the connection breaks if the transfer is sustained for a long period of time. Please note that I am fitting a maximum of 640 bytes in a message by chunking the file contents. Instead of making one big transmission, I am doing multiple smaller ones.
Any help is appreciated.
Here is a screenshot of the logs that are produced for each send and receive operation performed at every end.
After getting rid of the logs, this is what it looks like. The receiving end seems to get stuck, but maybe it is the absence of a message from the sending end, due to a problem at either the sending end or the broker end, that leads to this happening.