Open mcuee opened 3 months ago
What do you mean by timeout here?
It will exit by itself after some idling time.
With any error or non-zero process status?
Let me see if I can capture the exit status.
This is what I do.
root@debian12ct2:~# iperf3 -s &
[1] 863
root@debian12ct2:~# -----------------------------------------------------------
Server listening on 5201 (test #1)
-----------------------------------------------------------
root@debian12ct2:~# ./crusader serve &
[2] 864
root@debian12ct2:~# Server running...
root@debian12ct2:~# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 1.1 101952 11648 ? Ss Mar26 0:00 /sbin/init
root 49 0.0 1.1 32952 12024 ? Ss Mar26 0:00 /lib/systemd/systemd-journald
systemd+ 90 0.0 0.8 17860 8704 ? Ss Mar26 0:00 /lib/systemd/systemd-networkd
root 101 0.0 0.1 3600 2048 ? Ss Mar26 0:00 /usr/sbin/cron -f
message+ 102 0.0 0.4 9132 4736 ? Ss Mar26 0:00 /usr/bin/dbus-daemon --system --address=systemd: --no
root 105 0.0 0.7 17156 7936 ? Ss Mar26 0:00 /lib/systemd/systemd-logind
root 151 0.0 0.1 2516 1536 pts/0 Ss+ Mar26 0:00 /sbin/agetty -o -p -- \u --noclear --keep-baud - 1152
root 152 0.0 0.3 6120 3968 pts/1 Ss Mar26 0:00 /bin/login -p --
root 153 0.0 0.1 2516 1536 pts/2 Ss+ Mar26 0:00 /sbin/agetty -o -p -- \u --noclear - linux
root 154 0.0 0.8 15408 9344 ? Ss Mar26 0:00 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startu
root 297 0.0 0.4 42652 4632 ? Ss Mar26 0:00 /usr/lib/postfix/sbin/master -w
postfix 299 0.0 0.6 43088 6784 ? S Mar26 0:00 qmgr -l -t unix -u
root 353 0.0 0.4 4980 4224 pts/1 S+ Mar26 0:00 -bash
postfix 855 0.0 0.6 43040 7040 ? S 21:49 0:00 pickup -l -t unix -u -c
root 863 0.0 0.3 8192 3840 pts/1 S 22:59 0:00 iperf3 -s
root 864 0.0 0.2 10452 2976 pts/1 Sl 22:59 0:01 ./crusader serve
root 871 0.0 1.0 17840 11136 ? Ss 23:11 0:00 sshd: root@pts/3
root 874 0.0 0.9 18700 10112 ? Ss 23:11 0:00 /lib/systemd/systemd --user
root 875 0.0 0.4 103012 4560 ? S 23:11 0:00 (sd-pam)
root 893 0.0 0.3 4976 4096 pts/3 Ss 23:11 0:00 -bash
root 898 0.0 0.3 8088 4096 pts/3 R+ 23:11 0:00 ps aux
root@debian12ct2:~# date
Wed Mar 27 23:12:15 UTC 2024
It is working now.
PS C:\work\speedtest\crusader-x86_64-pc-windows-msvc> .\crusader test 192.168.38.6
Connected to server 192.168.38.6:35481
Latency to server 2.70 ms
Testing download...
Testing upload...
Testing both download and upload...
Warning: Load termination timed out. There may be residual untracked traffic in the background.
Writing data...
Saved raw data as data 2024.03.28 07-58-57.crr
Saved plot as plot 2024.03.28 07-58-57.png
Then, the next day, the iperf3 process is still there but the crusader process is gone.
Note: I am located in Singapore (GMT+8). Now it is in the morning. I will check again this evening and tomorrow morning to see when the process is gone.
Same for OpenWRT, I was running crusader and iperf3 at the same time yesterday and now crusader process is gone.
root@OpenWrt:~# ps | grep iperf3
4031 root 1144 S grep iperf3
15514 root 972 S iperf3 -s
root@OpenWrt:~# ps | grep crusader
4045 root 1148 S grep crusader
root@OpenWrt:~#
Still alive after 10 hours.
root@debian12ct2:~# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 1.1 101952 11648 ? Ss Mar26 0:00 /sbin/init
root 49 0.0 1.1 32952 12280 ? Ss Mar26 0:00 /lib/systemd/systemd-journald
systemd+ 90 0.0 0.8 17860 8704 ? Ss Mar26 0:00 /lib/systemd/systemd-networkd
root 101 0.0 0.1 3600 2048 ? Ss Mar26 0:00 /usr/sbin/cron -f
message+ 102 0.0 0.4 9132 4736 ? Ss Mar26 0:00 /usr/bin/dbus-daemon --system --address=systemd: --no
root 105 0.0 0.7 17164 7936 ? Ss Mar26 0:00 /lib/systemd/systemd-logind
root 151 0.0 0.1 2516 1536 pts/0 Ss+ Mar26 0:00 /sbin/agetty -o -p -- \u --noclear --keep-baud - 1152
root 152 0.0 0.3 6120 3968 pts/1 Ss Mar26 0:00 /bin/login -p --
root 153 0.0 0.1 2516 1536 pts/2 Ss+ Mar26 0:00 /sbin/agetty -o -p -- \u --noclear - linux
root 154 0.0 0.8 15408 9344 ? Ss Mar26 0:00 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startu
root 297 0.0 0.4 42652 4632 ? Ss Mar26 0:00 /usr/lib/postfix/sbin/master -w
postfix 299 0.0 0.6 43088 6784 ? S Mar26 0:00 qmgr -l -t unix -u
root 353 0.0 0.4 4980 4224 pts/1 S+ Mar26 0:00 -bash
root 863 0.0 0.3 8192 3840 pts/1 S Mar27 0:00 iperf3 -s
root 864 0.0 0.2 10464 2960 pts/1 Sl Mar27 0:02 ./crusader serve
postfix 1156 0.0 0.6 43040 7040 ? S 09:30 0:00 pickup -l -t unix -u -c
root 1161 0.0 1.0 17840 11136 ? Ss 10:46 0:00 sshd: root@pts/3
root 1164 0.0 0.9 18700 9984 ? Ss 10:47 0:00 /lib/systemd/systemd --user
root 1165 0.0 0.4 103012 4572 ? S 10:47 0:00 (sd-pam)
root 1183 0.0 0.3 4976 4096 pts/3 Ss 10:47 0:00 -bash
root 1186 0.0 0.3 8088 4096 pts/3 R+ 10:47 0:00 ps aux
root@debian12ct2:~# date
Thu Mar 28 10:47:20 UTC 2024
PS C:\work\speedtest\crusader-x86_64-pc-windows-msvc> .\crusader test 192.168.38.6
Connected to server 192.168.38.6:35481
Latency to server 2.61 ms
Testing download...
Testing upload...
Testing both download and upload...
Warning: Server overload detected during test. Result should be discarded.
Warning: Load termination timed out. There may be residual untracked traffic in the background.
Writing data...
Saved raw data as data 2024.03.28 18-48-21.crr
Saved plot as plot 2024.03.28 18-48-21.png
Let me monitor in the next two days to see if this is a false alarm or not. Thanks.
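Rather than checking by hand each morning and evening, the moment the server disappears could be recorded with a small polling loop. This is only a sketch: `watch_pid`, the log path and the 60-second default interval are hypothetical names and values, not part of crusader or iperf3.

```shell
#!/bin/sh
# watch_pid PID LOGFILE [INTERVAL_SECONDS]
# Hypothetical helper: poll until PID no longer exists, then append a
# UTC timestamp to LOGFILE. Run it as root or as the owner of the process.
watch_pid() {
    wp_pid="$1"
    wp_log="$2"
    wp_interval="${3:-60}"
    # kill -0 sends no signal; it only tests whether the process can be signalled.
    while kill -0 "$wp_pid" 2>/dev/null; do
        sleep "$wp_interval"
    done
    echo "$(date -u) PID $wp_pid is no longer running" >>"$wp_log"
}
```

For the Debian session above, this might be started as `watch_pid 864 /tmp/crusader-watch.log &` (the log path is an assumption). The recorded time is only accurate to within one polling interval.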
I started the crusader server process under OpenWRT 23.05 a few hours ago as well, but it seems to have died after I logged in to check the status (OK) and then exited (GONE).
PS C:\work\speedtest\crusader-x86_64-pc-windows-msvc> ssh root@192.168.38.1
root@192.168.38.1's password:
BusyBox v1.36.1 (2023-11-14 13:38:11 UTC) built-in shell (ash)
_______ ________ __
| |.-----.-----.-----.| | | |.----.| |_
| - || _ | -__| || | | || _|| _|
|_______|| __|_____|__|__||________||__| |____|
|__| W I R E L E S S F R E E D O M
-----------------------------------------------------
OpenWrt 23.05.2, r23630-842932a63d
-----------------------------------------------------
root@OpenWrt:~# date
Thu Mar 28 14:31:24 UTC 2024
root@OpenWrt:~# ps | grep iperf3
4446 root 1144 R grep iperf3
15514 root 972 S iperf3 -s
root@OpenWrt:~# ps | grep crusader
4468 root 1148 R grep crusader
6200 root 10432 S ./crusader serve
root@OpenWrt:~# exit
Connection to 192.168.38.1 closed.
I ran the test on the client side and the test failed.
PS C:\work\speedtest\crusader-x86_64-pc-windows-msvc> .\crusader test 192.168.38.1
Connected to server 192.168.38.1:35481
thread 'tokio-runtime-worker' panicked at crusader-lib\src\test.rs:791:71:
called `Result::unwrap()` on an `Err` value: "Expected object"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
PS C:\work\speedtest\crusader-x86_64-pc-windows-msvc> .\crusader test 192.168.38.1
thread 'main' panicked at crusader-lib\src\test.rs:1318:10:
called `Result::unwrap()` on an `Err` value: Os { code: 10061, kind: ConnectionRefused, message: "No connection could be made because the target machine actively refused it." }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
PS C:\work\speedtest\crusader-x86_64-pc-windows-msvc> ssh root@192.168.38.1
Then I logged in again, the process was gone.
PS C:\work\speedtest\crusader-x86_64-pc-windows-msvc> ssh root@192.168.38.1
root@192.168.38.1's password:
BusyBox v1.36.1 (2023-11-14 13:38:11 UTC) built-in shell (ash)
_______ ________ __
| |.-----.-----.-----.| | | |.----.| |_
| - || _ | -__| || | | || _|| _|
|_______|| __|_____|__|__||________||__| |____|
|__| W I R E L E S S F R E E D O M
-----------------------------------------------------
OpenWrt 23.05.2, r23630-842932a63d
-----------------------------------------------------
root@OpenWrt:~# date
Thu Mar 28 14:39:24 UTC 2024
root@OpenWrt:~# ps | grep iperf3
6281 root 1144 S grep iperf3
15514 root 972 S iperf3 -s
root@OpenWrt:~# ps | grep crusader
6289 root 1148 S grep crusader
No issues with the Debian 12 LXC container.
PS C:\work\speedtest\crusader-x86_64-pc-windows-msvc> ssh root@192.168.38.6
root@192.168.38.6's password:
Linux debian12ct2 6.5.11-7-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-7 (2023-12-05T09:44Z) x86_64
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Thu Mar 28 10:47:01 2024 from 192.168.38.120
root@debian12ct2:~# ps aux | grep iperf3
root 863 0.0 0.3 8192 3840 pts/1 S Mar27 0:00 iperf3 -s
root 1238 0.0 0.1 3324 1536 pts/3 S+ 14:44 0:00 grep iperf3
root@debian12ct2:~# ps aux | grep crusader
root 864 0.0 0.2 10500 2996 pts/1 Sl Mar27 0:03 ./crusader serve
root 1240 0.0 0.1 3324 1536 pts/3 S+ 14:44 0:00 grep crusader
root@debian12ct2:~# exit
logout
Connection to 192.168.38.6 closed.
PS C:\work\speedtest\crusader-x86_64-pc-windows-msvc> .\crusader test 192.168.38.6
Connected to server 192.168.38.6:35481
Latency to server 2.42 ms
Testing download...
Testing upload...
Testing both download and upload...
Warning: Load termination timed out. There may be residual untracked traffic in the background.
Writing data...
Saved raw data as data 2024.03.28 22-45-28.crr
Saved plot as plot 2024.03.28 22-45-28.png
Okay, the issue is really only under OpenWRT.
Steps to reproduce: initially there are no issues.
OpenWRT side:
root@OpenWrt:~# ./crusader serve &
root@OpenWrt:~# Server running...
root@OpenWrt:~# ps | grep crusader
9484 root 10432 S ./crusader serve
9546 root 1148 R grep crusader
root@OpenWrt:~# Serving 192.168.38.120:3930, version 3
Serving complete for 192.168.38.120:3930
root@OpenWrt:~# exit
Connection to 192.168.38.1 closed.
Windows client side:
PS C:\work\speedtest\crusader-x86_64-pc-windows-msvc> .\crusader test 192.168.38.1
Connected to server 192.168.38.1:35481
Latency to server 2.76 ms
Testing download...
Testing upload...
Testing both download and upload...
Writing data...
Saved raw data as data 2024.03.29 11-21-47.crr
Saved plot as plot 2024.03.29 11-21-47.png
PS C:\work\speedtest> ssh root@192.168.38.1
root@192.168.38.1's password:
BusyBox v1.36.1 (2023-11-14 13:38:11 UTC) built-in shell (ash)
_______ ________ __
| |.-----.-----.-----.| | | |.----.| |_
| - || _ | -__| || | | || _|| _|
|_______|| __|_____|__|__||________||__| |____|
|__| W I R E L E S S F R E E D O M
-----------------------------------------------------
OpenWrt 23.05.2, r23630-842932a63d
-----------------------------------------------------
root@OpenWrt:~# ps | grep crusader
9484 root 10516 S ./crusader serve
9716 root 1148 R grep crusader
Then I ran the test from the Windows client again and it failed. On the OpenWRT side, we can see the server process is gone.
Windows client side
PS C:\work\speedtest\crusader-x86_64-pc-windows-msvc> .\crusader test 192.168.38.1
Connected to server 192.168.38.1:35481
thread 'tokio-runtime-worker' panicked at crusader-lib\src\test.rs:791:71:
called `Result::unwrap()` on an `Err` value: "Expected object"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
root@OpenWrt:~# ps | grep crusader
9756 root 1148 S grep crusader
Debug log:
PS C:\work\speedtest\crusader-x86_64-pc-windows-msvc> .\crusader test 192.168.38.1
Connected to server 192.168.38.1:35481
thread 'tokio-runtime-worker' panicked at crusader-lib\src\test.rs:791:71:
called `Result::unwrap()` on an `Err` value: "Expected object"
stack backtrace:
0: 0x7ff7a538ec83 - <unknown>
1: 0x7ff7a53a9eed - <unknown>
2: 0x7ff7a538b7d1 - <unknown>
3: 0x7ff7a538ea8a - <unknown>
4: 0x7ff7a5390b79 - <unknown>
5: 0x7ff7a539083b - <unknown>
6: 0x7ff7a5391064 - <unknown>
7: 0x7ff7a5390f35 - <unknown>
8: 0x7ff7a538f319 - <unknown>
9: 0x7ff7a5390c44 - <unknown>
10: 0x7ff7a53d2277 - <unknown>
11: 0x7ff7a53d2733 - <unknown>
12: 0x7ff7a52a64e4 - <unknown>
13: 0x7ff7a52d66ad - <unknown>
14: 0x7ff7a5330531 - <unknown>
15: 0x7ff7a532faee - <unknown>
16: 0x7ff7a532f169 - <unknown>
17: 0x7ff7a532b6ed - <unknown>
18: 0x7ff7a5333941 - <unknown>
19: 0x7ff7a5325adb - <unknown>
20: 0x7ff7a531eaef - <unknown>
21: 0x7ff7a531ef75 - <unknown>
22: 0x7ff7a539592c - <unknown>
23: 0x7ff842eb257d - BaseThreadInitThunk
24: 0x7ff844d8aa48 - RtlUserThreadStart
@Zoxc
I have changed the title to reflect the issue I can consistently reproduce. I am using an OpenWRT 23.05 x86_64 VM on Proxmox PVE 8.0.
This seems to be similar to the resolved issue.
@richb-hanover Just wondering if you can test the server on OpenWRT and see if you can reproduce the issue or not. Thanks.
Steps to reproduce: you must run the crusader server in the background and must exit the OpenWRT login session once. Before the exit, everything works fine. After the exit, the first test run fails and the crusader server process exits. There are no traces on the OpenWRT side (server side).
PS C:\work\speedtest> ssh root@192.168.38.1
root@192.168.38.1's password:
BusyBox v1.36.1 (2023-11-14 13:38:11 UTC) built-in shell (ash)
_______ ________ __
| |.-----.-----.-----.| | | |.----.| |_
| - || _ | -__| || | | || _|| _|
|_______|| __|_____|__|__||________||__| |____|
|__| W I R E L E S S F R E E D O M
-----------------------------------------------------
OpenWrt 23.05.2, r23630-842932a63d
-----------------------------------------------------
root@OpenWrt:~# ./crusader serve &
root@OpenWrt:~# Server running...
Serving 192.168.38.120:4422, version 3
Serving complete for 192.168.38.120:4422
Serving 192.168.38.120:4491, version 3
Serving complete for 192.168.38.120:4491
root@OpenWrt:~# Serving 192.168.38.120:4624, version 3
Serving complete for 192.168.38.120:4624
root@OpenWrt:~# exit
Connection to 192.168.38.1 closed.
PS C:\work\speedtest> ssh root@192.168.38.1
root@192.168.38.1's password:
BusyBox v1.36.1 (2023-11-14 13:38:11 UTC) built-in shell (ash)
_______ ________ __
| |.-----.-----.-----.| | | |.----.| |_
| - || _ | -__| || | | || _|| _|
|_______|| __|_____|__|__||________||__| |____|
|__| W I R E L E S S F R E E D O M
-----------------------------------------------------
OpenWrt 23.05.2, r23630-842932a63d
-----------------------------------------------------
root@OpenWrt:~# ps | grep crusader
12012 root 10568 S ./crusader serve
12439 root 1148 R grep crusader
(note: client crashed)
root@OpenWrt:~# ps | grep crusader
12500 root 1148 R grep crusader
root@OpenWrt:~# exit
Connection to 192.168.38.1 closed.
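Because the server leaves no trace on the OpenWRT side, it may help to keep its output and exit status in a file instead of on the login terminal. A minimal sketch, assuming a hypothetical wrapper `run_logged` and an arbitrary log path; nothing here is a crusader feature:

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical wrapper: run COMMAND with stdin from /dev/null and
# stdout/stderr appended to LOGFILE, then record its exit status there.
run_logged() {
    rl_log="$1"
    shift
    rl_rc=0
    "$@" </dev/null >>"$rl_log" 2>&1 || rl_rc=$?
    echo "$(date -u) '$*' exited with status $rl_rc" >>"$rl_log"
    return "$rl_rc"
}
```

It could be started as `run_logged /tmp/crusader.log ./crusader serve &` before logging out. By shell convention, a recorded status above 128 usually means the process was terminated by a signal (128 plus the signal number).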
Detailed client side debug log from debug version of crusader.
For the Debian/Ubuntu container, I can also occasionally recreate the issue after a while.
mcuee@mcuees-Mac-mini crusader-aarch64-apple-darwin % ./crusader test 192.168.50.15
Connected to server 192.168.50.15:35481
thread 'main' panicked at crusader-lib/src/test.rs:1318:10:
called `Result::unwrap()` on an `Err` value: "Expected object"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
zsh: abort ./crusader test 192.168.50.15
Once this happens, the server process will exit.
@richb-hanover Just wondering if you can test the server on OpenWRT and see if you can reproduce the issue or not. Thanks.
Sorry, I don't have a platform for testing crusader server on OpenWrt. Hopefully, you'll get clues from the Debian/Ubuntu container...
It seems to me that the crusader server can easily time out after being idle, unlike iperf3.
Is this a known behavior? If yes, how do I work around it?
FYI, so far I have only run the server on the Linux side (Debian 12 LXC container or OpenWRT 23.05).
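Since the failure shows up only after the login session that started the server has exited, one workaround that might be worth trying is to start the server fully detached from that terminal. This is an untested guess, not a confirmed fix or a diagnosis of the cause; `start_detached` and the log path are hypothetical, and `nohup` is used only if it happens to be installed.

```shell
#!/bin/sh
# start_detached LOGFILE COMMAND [ARGS...]
# Hypothetical helper: start COMMAND in the background with no ties to the
# login terminal (stdin from /dev/null, output appended to LOGFILE) and
# print its PID. nohup, when present, also makes COMMAND ignore SIGHUP.
start_detached() {
    sd_log="$1"
    shift
    if command -v nohup >/dev/null 2>&1; then
        set -- nohup "$@"
    fi
    "$@" </dev/null >>"$sd_log" 2>&1 &
    echo "$!"
}
```

For example, `start_detached /tmp/crusader.log ./crusader serve` would print the server PID, and any "Serving ..." lines would then go to the log file rather than to the SSH session.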