Nukesor / pueue

:stars: Manage your shell commands.
MIT License

[Bug] Sometimes pueued does not respond to the pueue client #541

Open Shihira opened 3 weeks ago

Shihira commented 3 weeks ago

Describe the bug

Sometimes pueued does not respond to pueue status. That is to say, the pueue client hangs forever waiting for the daemon. Commands like pueue add and pueue clean continue to work, especially that pueue clean can sometimes get things recovered. My guess is that the message from the daemon is too long or gets truncated?

Steps to reproduce

Not 100% reproducible; it usually happens when several tasks are added at the same time.

Debug logs (if relevant)

16:16:01 [INFO] Parsing config files
16:16:01 [INFO] Checking path: "C:\\Users\\clairfeng\\AppData\\Roaming\\pueue\\pueue.yml"
16:16:01 [INFO] Found config file at: "C:\\Users\\clairfeng\\AppData\\Roaming\\pueue\\pueue.yml"
16:16:01 [DEBUG] (1) rustls::client::hs: No cached session for DnsName("pueue.local")
16:16:01 [DEBUG] (1) rustls::client::hs: Not resuming any session
16:16:01 [DEBUG] (1) rustls::client::hs: Using ciphersuite TLS13_AES_256_GCM_SHA384
16:16:01 [DEBUG] (1) rustls::client::tls13: Not resuming
16:16:01 [DEBUG] (1) rustls::client::tls13: TLS1.3 encrypted extensions: [ServerNameAck]
16:16:01 [DEBUG] (1) rustls::client::hs: ALPN protocol is None
16:16:01 [DEBUG] (1) pueue_lib::network::protocol: Sending message: Status
16:16:01 [DEBUG] (1) rustls::server::hs: decided upon suite TLS13_AES_256_GCM_SHA384
16:16:01 [DEBUG] (2) pueue_lib::network::protocol: Received message: Status
16:16:01 [DEBUG] (2) pueue_lib::network::protocol: Sending message: StatusResponse(
...
)

Although the server had sent the message, the client never seemed to have received it.

Operating system

Windows 10

Pueue version

v3.4.0

Additional context

No response

Nukesor commented 3 weeks ago

Phew, that could be anything.

This will need somebody with a Windows machine to look into this issue :)

It would be great if anybody that also runs into this issue could take a look!

Shihira commented 2 weeks ago

I added some logging around sending and receiving bytes. In my case the server sent 170334 bytes but the client received only 161280 bytes. The tail went missing for some strange reason, which is especially odd because the connection was localhost TCP, so there should not have been any network stability problems. I changed PACKET_SIZE from 1280 to 64K and the problem seems to have disappeared for now, but I am not sure whether this is a proper fix.
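A quick check of those byte counts, as a sketch (the helper name `chunk_layout` is hypothetical and not part of pueue): 161280 is exactly 126 × 1280, so the client appears to have stopped on a chunk boundary. The missing 9054 bytes correspond to 7 full chunks plus the final 94-byte partial chunk.

```rust
/// Hypothetical helper: split a payload length into the number of full
/// chunks and the size of the trailing partial chunk.
fn chunk_layout(total_bytes: usize, packet_size: usize) -> (usize, usize) {
    (total_bytes / packet_size, total_bytes % packet_size)
}

/// Hypothetical helper: number of bytes that never arrived.
fn missing_bytes(sent: usize, received: usize) -> usize {
    sent.saturating_sub(received)
}
```

With the numbers above, `chunk_layout(170334, 1280)` gives 133 full chunks plus a 94-byte tail, while `chunk_layout(161280, 1280)` gives 126 full chunks and no remainder.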

Nukesor commented 2 weeks ago

This is odd. I didn't expect this to be a networking issue, let alone an MTU issue. Though maybe that's a red herring and the issue just doesn't appear because fewer frames are sent.

Using large frames will lead to issues in most networks and with payloads that are bigger than 64k. We had problems with those in the past, which is why a very conservative MTU of 1280 has been chosen.

It looks like some packets are lost, but since you're using TCP that really shouldn't be a problem...
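For reference, here is a minimal sketch of a length-prefixed, chunked transfer in which the receiver keeps reading until every announced byte has arrived. It is not pueue's actual code: it assumes blocking `std::io` streams instead of the async TLS connection shown in the logs, and the 8-byte big-endian header and the function names are assumptions made for illustration. Only the chunk size of 1280 comes from this thread.

```rust
use std::io::{self, Read, Write};

/// Chunk size taken from this thread; everything else here is illustrative.
const PACKET_SIZE: usize = 1280;

/// Write an 8-byte big-endian length header, then the payload in chunks.
fn send_bytes<W: Write>(stream: &mut W, payload: &[u8]) -> io::Result<()> {
    stream.write_all(&(payload.len() as u64).to_be_bytes())?;
    for chunk in payload.chunks(PACKET_SIZE) {
        // write_all retries short writes until the whole chunk is written.
        stream.write_all(chunk)?;
    }
    stream.flush()
}

/// Read the header, then loop until the announced number of bytes arrived.
fn receive_bytes<R: Read>(stream: &mut R) -> io::Result<Vec<u8>> {
    let mut header = [0u8; 8];
    stream.read_exact(&mut header)?;
    let expected = u64::from_be_bytes(header) as usize;

    let mut payload = Vec::with_capacity(expected);
    let mut buffer = [0u8; PACKET_SIZE];
    while payload.len() < expected {
        let wanted = (expected - payload.len()).min(PACKET_SIZE);
        // A single read may return fewer bytes than requested on a stream.
        let read = match stream.read(&mut buffer[..wanted]) {
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        if read == 0 {
            // The peer closed the connection before the tail arrived.
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "connection closed before the full payload arrived",
            ));
        }
        payload.extend_from_slice(&buffer[..read]);
    }
    Ok(payload)
}
```

In this sketch a missing tail surfaces as an `UnexpectedEof` error rather than an endless wait, provided the sender closes the connection. If the sender keeps the connection open, the receiver would still block, which matches the hang described in this issue.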

I tried to reproduce your problem and connected my local client to my server via TCP (via a wireguard transport layer) and it worked just fine. This makes it tricky for me to debug though, as I really cannot do any analysis as long as I cannot reproduce the issue :<

Nukesor commented 2 weeks ago

One more question. What exactly do you mean by "especially that pueue clean can sometimes get things recovered."?

Nukesor commented 1 week ago

Ping @Shihira