Closed toncho11 closed 3 years ago
Thank you @pawosm-arm -
are you saying that ping with big packets works in your case (ping from a recent Linux version, that is), with the previous version of the driver?
—Mellvik
- Nov. 2020 at 18:12, pawosm-arm notifications@github.com wrote:
@Mellvik https://github.com/Mellvik I've just tried wd.c above. It does not break anything, it does not change anything, yet in my case, the driver always worked, so I can only confirm that at least there's no regression.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jbruchon/elks/issues/877#issuecomment-733117095, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA3WGOHITD3ZIG7W22M2FTLSRPSPVANCNFSM4T5KRKKA.
How big is big? The biggest ping that returns is 1449, e.g.:
$ ping -i 0.4 -s 1449 192.168.1.44
PING 192.168.1.44 (192.168.1.44) 1449(1477) bytes of data.
1457 bytes from 192.168.1.44: icmp_seq=1 ttl=64 time=39.1 ms
1457 bytes from 192.168.1.44: icmp_seq=2 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=3 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=4 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=6 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=7 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=8 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=10 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=11 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=12 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=14 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=15 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=16 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=17 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=18 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=19 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=20 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=21 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=23 ttl=64 time=29.0 ms
1457 bytes from 192.168.1.44: icmp_seq=24 ttl=64 time=29.0 ms
^C
--- 192.168.1.44 ping statistics ---
24 packets transmitted, 20 received, 16.6667% packet loss, time 9213ms
rtt min/avg/max/mdev = 29.036/29.722/39.122/2.158 ms
The result is the same with and without your patch (I specifically rebuilt ELKS without your patch to make sure), and I can still telnet to some other host from ELKS while being pinged like this. The ELKS console is not disrupted by any error messages at the same time.
and how recent is recent? The machine from which I was sending those pings is:
$ uname -a
Linux 5.4.72-gentoo-x86_64 #1 SMP Mon Oct 19 18:08:12 BST 2020 x86_64 Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz GenuineIntel GNU/Linux
$ ping -V
ping from iputils s20190709
The ping problem does not produce error messages on ELKS, only on the sending Linux machine.
Either way, I didn't observe error messages on any side.
@pawosm-arm,
You have a very high packet loss. Can you check whether the packet loss is the same in both cases? You should have zero packet loss. That may be the explanation.
It may well be that my fix doesn't work yet!!
—M
You have a very high packet loss.
I've noticed that. Sadly, it's the same with or without the patch.
I also tested if making the time interval larger could help. Nope, the results still look the same. Without patch:
$ ping -i 3.5 -s 1449 192.168.1.44
PING 192.168.1.44 (192.168.1.44) 1449(1477) bytes of data.
1457 bytes from 192.168.1.44: icmp_seq=1 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=2 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=3 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=4 ttl=64 time=29.0 ms
1457 bytes from 192.168.1.44: icmp_seq=5 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=6 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=7 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=9 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=10 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=11 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=13 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=14 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=15 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=17 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=18 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=19 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=20 ttl=64 time=29.0 ms
1457 bytes from 192.168.1.44: icmp_seq=21 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=22 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=23 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=25 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=26 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=27 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=29 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=30 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=31 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=33 ttl=64 time=29.0 ms
1457 bytes from 192.168.1.44: icmp_seq=34 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=35 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=36 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=37 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=38 ttl=64 time=29.0 ms
1457 bytes from 192.168.1.44: icmp_seq=39 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=40 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=42 ttl=64 time=29.0 ms
1457 bytes from 192.168.1.44: icmp_seq=43 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=44 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=46 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=47 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=48 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=50 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=51 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=52 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=53 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=54 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=55 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=56 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=57 ttl=64 time=29.0 ms
1457 bytes from 192.168.1.44: icmp_seq=59 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=60 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=61 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=63 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=64 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=65 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=67 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=68 ttl=64 time=29.1 ms
^C
--- 192.168.1.44 ping statistics ---
68 packets transmitted, 56 received, 17.6471% packet loss, time 234716ms
rtt min/avg/max/mdev = 28.958/29.224/29.413/0.114 ms
With patch:
$ ping -i 3.5 -s 1449 192.168.1.44
PING 192.168.1.44 (192.168.1.44) 1449(1477) bytes of data.
1457 bytes from 192.168.1.44: icmp_seq=1 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=2 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=3 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=5 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=6 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=7 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=9 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=10 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=11 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=13 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=14 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=15 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=16 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=18 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=19 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=20 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=22 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=23 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=24 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=26 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=27 ttl=64 time=29.0 ms
1457 bytes from 192.168.1.44: icmp_seq=28 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=29 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=30 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=31 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=32 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=33 ttl=64 time=29.4 ms
^C
--- 192.168.1.44 ping statistics ---
33 packets transmitted, 27 received, 18.1818% packet loss, time 112098ms
rtt min/avg/max/mdev = 29.033/29.264/29.393/0.102 ms
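To compare runs like the two above without eyeballing the output, the loss figure can be recomputed from the summary line. This is a minimal sketch, assuming the iputils summary format shown in this thread; `loss_percent` is a hypothetical helper name, not part of any tool.

```python
import re

# Matches the iputils summary line, e.g.
# "68 packets transmitted, 56 received, 17.6471% packet loss, time 234716ms"
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) received")


def loss_percent(summary_line: str) -> float:
    """Return the packet loss in percent, recomputed from the counters."""
    match = SUMMARY_RE.search(summary_line)
    if match is None:
        raise ValueError("not a ping summary line")
    sent, received = int(match.group(1)), int(match.group(2))
    return 100.0 * (sent - received) / sent


without_patch = "68 packets transmitted, 56 received, 17.6471% packet loss, time 234716ms"
with_patch = "33 packets transmitted, 27 received, 18.1818% packet loss, time 112098ms"

print(f"without patch: {loss_percent(without_patch):.4f}%")
print(f"with patch:    {loss_percent(with_patch):.4f}%")
```

Both runs land within about half a percentage point of each other, which is consistent with the observation that this version of the patch changed nothing.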
@pawosm-arm Maybe you also need to leave it running a bit longer, until icmp_seq is over 60 for example. My examples are after 38.
It makes the score worse, e.g. with the patch, interval 0.5s:
--- 192.168.1.44 ping statistics ---
60 packets transmitted, 47 received, 21.6667% packet loss, time 29544ms
rtt min/avg/max/mdev = 28.555/29.200/29.396/0.146 ms
In my case without patch I do not get packet loss, but I get wrong return values.
Keep in mind that this ELKS machine is slow. It may not be able to respond in time, so a late reply could be counted as a timeout (and hence as a missing response).
@pawosm-arm, this is very interesting indeed. The packet loss may be because the machine is slow, so skip the -i option for now. Also, my patch may not work yet. @toncho11, have you tested the patch?
The thing is, as far as I can see, there is no way you can have reliable file transfers (packets exceeding one NIC page) with the current driver. Telnet will work fine (outgoing telnet from elks and long listings such as ls -lR should trigger the error).
@pawosm-arm, is there a chance you can run @toncho11's urlget script?
-M
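On the point above about packets exceeding one NIC page: a rough way to see which ping sizes are affected is to count the ring pages each echo occupies. This is a sketch under stated assumptions: the 256-byte page size follows from the `<< 8` shifts in the wd.c excerpt later in this thread, while the 4-byte ring header and the stored 4-byte frame check sequence are assumptions about the NIC, and `pages_for_ping` is a hypothetical helper.

```python
PAGE_SIZE = 256        # ring page size, matching the "<< 8" in the driver excerpt
RING_HEADER = 4        # assumed size of the per-packet header kept in the ring
ETH_HEADER = 14        # Ethernet header: two MAC addresses plus EtherType
ETH_FCS = 4            # assumed: the frame check sequence is stored as well
IP_ICMP_HEADERS = 28   # 20-byte IP header plus 8-byte ICMP header


def pages_for_ping(payload: int) -> int:
    """Number of ring pages one echo request with this -s value occupies."""
    stored = RING_HEADER + ETH_HEADER + IP_ICMP_HEADERS + payload + ETH_FCS
    return -(-stored // PAGE_SIZE)  # ceiling division


for size in (56, 572, 600, 1449, 1472):
    print(f"ping -s {size}: {pages_for_ping(size)} page(s)")
```

Under these assumptions a default 56-byte ping fits in a single page and can never straddle the end of the ring, whereas the 572 to 600 byte pings take three pages and the 1449 and 1472 byte pings take six.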
@pawosm-arm, is there a chance you can run @toncho11's urlget script?
...with the correction of protocol (I don't have any ftp server, but I do have an http server). I tried to download a gzipped ELKS disk image and observed three TCP checksum errors. The urlget command did complete eventually, so there's some auto-repair capability, yet the problem itself becomes apparent.
Actually, @pawosm-arm - could you try a ping -s 600? It's worth noting that there are situations with only even numbered packets that may run forever without the error becoming visible.
Also, I just noticed - you're running ping interval 3.5 secs, so there is no way you should have packet loss. It seems likely that the loss is related to the problem we're discussing. And it's really weird that you're getting different results than @toncho11.
So, let's see what the new tests bring to the table.
--M
$ ping -i 0.4 -s 600 -c 60 192.168.1.44
PING 192.168.1.44 (192.168.1.44) 600(628) bytes of data
608 bytes from 192.168.1.44: icmp_seq=1 ttl=64 time=15.9 ms
...
--- 192.168.1.44 ping statistics ---
60 packets transmitted, 56 received, 6.66667% packet loss, time 23644ms
rtt min/avg/max/mdev = 15.136/15.403/21.872/0.879 ms
$ ping -i 0.4 -s 592 -c 60 192.168.1.44
PING 192.168.1.44 (192.168.1.44) 592(620) bytes of data.
600 bytes from 192.168.1.44: icmp_seq=1 ttl=64 time=15.3 ms
...
--- 192.168.1.44 ping statistics ---
60 packets transmitted, 56 received, 6.66667% packet loss, time 23630ms
rtt min/avg/max/mdev = 15.004/15.166/15.393/0.118 ms
$ ping -i 0.4 -s 572 -c 60 192.168.1.44
PING 192.168.1.44 (192.168.1.44) 572(600) bytes of data.
580 bytes from 192.168.1.44: icmp_seq=1 ttl=64 time=14.8 ms
...
--- 192.168.1.44 ping statistics ---
60 packets transmitted, 56 received, 6.66667% packet loss, time 23653ms
rtt min/avg/max/mdev = 14.690/14.879/15.073/0.124 ms
$ ping -i 3.5 -s 572 -c 20 192.168.1.44
PING 192.168.1.44 (192.168.1.44) 572(600) bytes of data.
580 bytes from 192.168.1.44: icmp_seq=1 ttl=64 time=14.9 ms
...
--- 192.168.1.44 ping statistics ---
20 packets transmitted, 19 received, 5% packet loss, time 66564ms
rtt min/avg/max/mdev = 14.730/14.842/15.069/0.108 ms
Ok, final test for now - ping with no parameters, running for at least 120 seconds...
Thank you!
-M
$ timeout -s 2 120s ping 192.168.1.44
PING 192.168.1.44 (192.168.1.44) 56(84) bytes of data.
64 bytes from 192.168.1.44: icmp_seq=1 ttl=64 time=6.58 ms
...
64 bytes from 192.168.1.44: icmp_seq=120 ttl=64 time=6.47 ms
--- 192.168.1.44 ping statistics ---
120 packets transmitted, 120 received, 0% packet loss, time 119169ms
rtt min/avg/max/mdev = 6.193/6.377/6.580/0.127 ms
Interesting. No losses.
Thank you, @pawosm-arm. Interesting indeed - or just confirming our suspicion: for some reason you're seeing packet loss where @toncho11 is seeing checksum errors, but the reason is the same.
Doing the math on buffer size and wraparound at different packet sizes and comparing with the loss numbers you've reported is the ultimate confirmation.
There is a bug in my fix. I'm sharing an update shortly and would appreciate it if you could test it.
--Mellvik
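The "math on buffer size and wraparound" mentioned above can be sketched as a small model of the receive ring. Everything here is illustrative: `RING_PAGES = 30` is a made-up value, not taken from the driver (the real size is `WD_STOP_PG - WD_FIRST_RX_PG`), and the page counts per packet (1, 3, 6) assume 256-byte pages with a few bytes of per-packet overhead.

```python
def count_wraps(packet_pages, ring_pages, start=0):
    """Count packets that straddle the end of a ring of `ring_pages` pages."""
    pos = start
    wraps = 0
    for pages in packet_pages:
        if pos + pages > ring_pages:   # packet runs past the last page
            wraps += 1
        pos = (pos + pages) % ring_pages
    return wraps


def uniform_wrap_fraction(pages, ring_pages):
    """Expected share of wrapping packets if start pages are evenly spread."""
    return (pages - 1) / ring_pages


RING_PAGES = 30  # hypothetical ring size, for illustration only

# Identical 6-page packets stay aligned in a 30-page ring and never wrap.
print(count_wraps([6] * 100, RING_PAGES))
# Interleaving 1-page packets (ARP, small pings) shifts the alignment.
print(count_wraps([6, 1] * 50, RING_PAGES))

for pages in (1, 3, 6):
    print(pages, f"{uniform_wrap_fraction(pages, RING_PAGES):.1%}")
```

The model shows two effects worth checking against real numbers: single-page packets never wrap, and a stream of equal-sized packets can stay aligned with the ring and hide the problem until other traffic shifts the alignment. To compare with the reported losses, substitute the driver's actual ring size.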
@toncho11, @pawosm-arm - here's the updated fix, possibly the ultimate fix :-), please test:
(only one line affected - 286)
} else { /* handle wrap-around */
size_t len1 = ((WD_STOP_PG - this_frame) << 8) - sizeof(e8390_pkt_hdr);
fmemcpyb(data, current->t_regs.ds,
(char *)hdr_start + sizeof(e8390_pkt_hdr), WD_SHMEMSEG, len1);
fmemcpyb(data+len1, current->t_regs.ds,
---> (char *)(WD_FIRST_RX_PG << 8), WD_SHMEMSEG, res-len1); <-- change this line
}
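For readers who want to experiment with the wrap-around branch outside the kernel, here is a minimal Python model of the same two-part copy. It is a sketch, not the driver: `read_frame` and the 8-page demo buffer are hypothetical, and the 4-byte header size is an assumption standing in for `sizeof(e8390_pkt_hdr)`.

```python
PAGE_SIZE = 256   # matches the "<< 8" in the C excerpt
HDR_SIZE = 4      # assumed stand-in for sizeof(e8390_pkt_hdr)


def read_frame(shmem, this_frame, length, first_rx_pg, stop_pg):
    """Copy `length` payload bytes of the frame starting at page `this_frame`."""
    start = this_frame * PAGE_SIZE + HDR_SIZE
    ring_end = stop_pg * PAGE_SIZE
    if start + length <= ring_end:
        return bytes(shmem[start:start + length])
    # Wrap-around: the first part runs to the end of the ring, the rest
    # continues at the first receive page (the line changed in the fix).
    len1 = ring_end - start
    ring_start = first_rx_pg * PAGE_SIZE
    return bytes(shmem[start:ring_end]) + bytes(
        shmem[ring_start:ring_start + length - len1]
    )


# Demo with a hypothetical 8-page buffer whose receive ring is pages 2..7.
FIRST_RX_PG, STOP_PG = 2, 8
shmem = bytearray(STOP_PG * PAGE_SIZE)
payload = bytes(i % 251 for i in range(600))

# Store the payload the way the NIC would: starting in the last page,
# then continuing at the first receive page.
head = PAGE_SIZE - HDR_SIZE
shmem[7 * PAGE_SIZE + HDR_SIZE:8 * PAGE_SIZE] = payload[:head]
shmem[2 * PAGE_SIZE:2 * PAGE_SIZE + len(payload) - head] = payload[head:]

print(read_frame(shmem, 7, len(payload), FIRST_RX_PG, STOP_PG) == payload)
```

Here `len1` equals `((stop_pg - this_frame) << 8) - HDR_SIZE`, the same quantity the C code computes.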
It's much better now! 60 pings of size 1449 returned with no losses, then 70 pings of the same size also returned with no losses, then I did urlget of a large disk image, no TCP checksum errors observed this time, BUT, at the same time this host was infinitely pinged with interval 0.4 and when I interrupted it, some losses were observed:
1457 bytes from 192.168.1.44: icmp_seq=418 ttl=64 time=29.7 ms
^C
--- 192.168.1.44 ping statistics ---
418 packets transmitted, 415 received, 0.717703% packet loss, time 167230ms
rtt min/avg/max/mdev = 29.027/63.236/727.606/107.975 ms, pipe 2
Also, larger pings now get through than before: the largest ping that still returns is now 1472 (it was 1449 before this updated patch):
$ ping -i 0.4 -s 1472 192.168.1.44
PING 192.168.1.44 (192.168.1.44) 1472(1500) bytes of data.
1480 bytes from 192.168.1.44: icmp_seq=1 ttl=64 time=29.6 ms
1480 bytes from 192.168.1.44: icmp_seq=2 ttl=64 time=29.5 ms
1480 bytes from 192.168.1.44: icmp_seq=3 ttl=64 time=29.7 ms
1480 bytes from 192.168.1.44: icmp_seq=4 ttl=64 time=29.7 ms
1480 bytes from 192.168.1.44: icmp_seq=5 ttl=64 time=29.7 ms
1480 bytes from 192.168.1.44: icmp_seq=6 ttl=64 time=29.7 ms
1480 bytes from 192.168.1.44: icmp_seq=7 ttl=64 time=29.7 ms
1480 bytes from 192.168.1.44: icmp_seq=8 ttl=64 time=29.7 ms
1480 bytes from 192.168.1.44: icmp_seq=9 ttl=64 time=29.5 ms
1480 bytes from 192.168.1.44: icmp_seq=10 ttl=64 time=29.5 ms
1480 bytes from 192.168.1.44: icmp_seq=11 ttl=64 time=29.5 ms
1480 bytes from 192.168.1.44: icmp_seq=12 ttl=64 time=29.7 ms
^C
--- 192.168.1.44 ping statistics ---
12 packets transmitted, 12 received, 0% packet loss, time 4411ms
rtt min/avg/max/mdev = 29.459/29.607/29.725/0.102 ms
So generally, it is better now.
Great, @pawosm-arm - then it seems like the wrap-around bug in the driver has been fixed. I'll post a PR for that.
As to packet loss, if you bang heavily on ELKS it will lose packets. Presumably you got messages about buffer overruns on the console? If not, we may have to take a look at that too!
Then there is the other issue: why you didn't get checksum errors on elks. I would very much like to figure that one out so we don't have a hidden bug lurking somewhere. Can you create and post a tcpdump file without the latest fix, capturing the packets when doing a ping with big packets and frequent lost packets, at the standard interval (1s)?
It may seem like elks is just discarding the packets which it shouldn't - and doesn't in @toncho11's case. @toncho11 - if you're there and have a chance to test, it would be very useful!!
Thank you!
—Mellvik
As to packet loss, if you bang heavily on ELKS it will lose packets. Presumably you got messages about buffer overruns on the console? If not, we may have to take a look at that too!
Indeed, there were retrans messages.
Can you create and post a tcpdump file without the latest fix
I'll get back to this later today.
@pawosm-arm,
Did you see any eth: overflow messages?
-M
Did you see any eth: overflow messages?
I don't recall seeing any. A few retrans messages popped up, that's all I remember.
Can you create and post a tcpdump file without the latest fix
tcpdump -v exposes too many details of the local network here! I guess some of the relevant lines are (pinging ELKS with your patch reverted):
11:40:40.205119 IP (tos 0x0, ttl 64, id 30654, offset 0, flags [DF], proto ICMP (1), length 1477)
cortex > 192.168.1.44: ICMP echo request, id 20, seq 1, length 1457
11:40:40.244767 IP (tos 0x0, ttl 64, id 213, offset 0, flags [none], proto ICMP (1), length 1477)
192.168.1.44 > cortex: ICMP echo reply, id 20, seq 1, length 1457
11:40:40.606268 IP (tos 0x0, ttl 64, id 30655, offset 0, flags [DF], proto ICMP (1), length 1477)
cortex > 192.168.1.44: ICMP echo request, id 20, seq 2, length 1457
11:40:40.635512 IP (tos 0x0, ttl 64, id 214, offset 0, flags [none], proto ICMP (1), length 1477)
192.168.1.44 > cortex: ICMP echo reply, id 20, seq 2, length 1457
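When the capture is longer than a few lines, pairing requests with replies by hand gets tedious. The following is a minimal sketch that scans `tcpdump -v` text output in the format shown above and lists echo requests that never got a reply; `unanswered_requests` is a hypothetical helper, and it assumes the "ICMP echo request/reply, id N, seq N" wording seen in this excerpt.

```python
import re

# Matches the second line of each tcpdump -v record shown above.
ECHO_RE = re.compile(r"ICMP echo (request|reply), id (\d+), seq (\d+)")


def unanswered_requests(tcpdump_text):
    """Return sorted (id, seq) pairs that have a request but no reply."""
    requests, replies = set(), set()
    for kind, ident, seq in ECHO_RE.findall(tcpdump_text):
        key = (int(ident), int(seq))
        if kind == "request":
            requests.add(key)
        else:
            replies.add(key)
    return sorted(requests - replies)


sample = """\
cortex > 192.168.1.44: ICMP echo request, id 20, seq 1, length 1457
192.168.1.44 > cortex: ICMP echo reply, id 20, seq 1, length 1457
cortex > 192.168.1.44: ICMP echo request, id 20, seq 2, length 1457
192.168.1.44 > cortex: ICMP echo reply, id 20, seq 2, length 1457
"""

print(unanswered_requests(sample))  # prints []
```

An empty list means every request seen on the wire was answered, so any loss reported by ping would then have to come from replies being rejected on the receiving side rather than never being sent.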
OK, what I need is all ICMP traffic between the two addresses during the time where you experience lost packets.
To avoid exposing anything from the network, you can ask tcpdump to filter on ICMP traffic. (It's actually useful to know that there are other nodes on the same segment: that means extra traffic, broadcast and ARP in particular, for ELKS to handle.)
tcpdump -v -w file icmp
or just filter on the node address:
tcpdump -v -w file host 192.168.1.44
—mellvik
or just filter on node address tcpdump -v -w file host 192.168.1.44
I did. I had to gzip it, though, so that GitHub would let me upload it.
pings.dump.gz https://github.com/jbruchon/elks/files/5596708/pings.dump.gz
Thanks! Perfect.
You got no lost packets on this one, right? (There are no traces of loss in the dump.)
Any chance you can create a similar one with losses?
-M
That's weird. Fortunately, the terminal window with those pings wasn't cleared yet:
$ ping -i 0.4 -s 1449 192.168.1.44
PING 192.168.1.44 (192.168.1.44) 1449(1477) bytes of data.
1457 bytes from 192.168.1.44: icmp_seq=1 ttl=64 time=29.9 ms
1457 bytes from 192.168.1.44: icmp_seq=3 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=4 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=5 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=7 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=8 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=9 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=10 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=11 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=12 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=13 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=14 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=16 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=17 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=18 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=20 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=21 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=22 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=24 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=25 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=26 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=27 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=29 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=30 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=31 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=33 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=34 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=35 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=37 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=38 ttl=64 time=29.0 ms
1457 bytes from 192.168.1.44: icmp_seq=39 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=40 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=42 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=43 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=44 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=46 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=47 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=48 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=50 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=51 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=52 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=53 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=55 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=56 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=57 ttl=64 time=29.0 ms
1457 bytes from 192.168.1.44: icmp_seq=59 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=60 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=61 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=63 ttl=64 time=29.2 ms
^C
--- 192.168.1.44 ping statistics ---
64 packets transmitted, 49 received, 23.4375% packet loss, time 25240ms
rtt min/avg/max/mdev = 29.036/29.228/29.913/0.145 ms
~23% packet loss as usual. No idea why tcpdump can't see it. Maybe the packet count is OK, but the contents aren't. There are a number of ping command providers in Linux; mine is from iputils, and I suspect it works slightly differently from the one @toncho11 used...
Thank you @pawosm-arm, that was very enlightening.
Now we know that all packets are coming back from elks, and elks is off the hook in that regard. What's still a mystery is why you're not getting checksum errors on ELKS.
It's also really weird that your ping doesn't report the errors, but apparently discards the packets instead. AFAIK most versions of Linux are using iputils for the basic IP utilities including ping - I certainly am (just checked), and I believe @toncho11 is too.
Just to make sure we've been through all alternatives, could you run exactly the same, just add the -v option to ping?
-Mellvik
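One way to reason about replies that arrive on the wire yet are not counted is the ICMP checksum. The sketch below implements the standard Internet checksum (RFC 1071) and shows that an echo reply with a single damaged payload byte no longer verifies. It is only an illustration of the mechanism: whether this is what happens in the setup discussed here is not confirmed in the thread, and `build_echo_reply` is a hypothetical helper.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's complement of the one's complement sum."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length data
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_reply(ident: int, seq: int, payload: bytes) -> bytes:
    """Build an ICMP echo reply (type 0, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 0, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 0, 0, checksum, ident, seq) + payload


payload = bytes(i % 256 for i in range(1449))   # same size as ping -s 1449
reply = build_echo_reply(20, 1, payload)

corrupted = bytearray(reply)
corrupted[700] ^= 0xFF                          # damage one payload byte

print(internet_checksum(reply) == 0)            # intact reply verifies
print(internet_checksum(bytes(corrupted)) == 0) # damaged reply does not
```

A receiver that checks this sum would treat the damaged reply as invalid, which would show up as loss in the ping statistics even though the packet is present in a capture.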
- nov. 2020 kl. 15:56 skrev pawosm-arm notifications@github.com:
That's weird. Fortunately, the terminal window with those pings wasn't cleared yet:
$ ping -i 0.4 -s 1449 192.168.1.44
PING 192.168.1.44 (192.168.1.44) 1449(1477) bytes of data.
1457 bytes from 192.168.1.44: icmp_seq=1 ttl=64 time=29.9 ms
1457 bytes from 192.168.1.44: icmp_seq=3 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=4 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=5 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=7 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=8 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=9 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=10 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=11 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=12 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=13 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=14 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=16 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=17 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=18 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=20 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=21 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=22 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=24 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=25 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=26 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=27 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=29 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=30 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=31 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=33 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=34 ttl=64 time=29.4 ms
1457 bytes from 192.168.1.44: icmp_seq=35 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=37 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=38 ttl=64 time=29.0 ms
1457 bytes from 192.168.1.44: icmp_seq=39 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=40 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=42 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=43 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=44 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=46 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=47 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=48 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=50 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=51 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=52 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=53 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=55 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=56 ttl=64 time=29.1 ms
1457 bytes from 192.168.1.44: icmp_seq=57 ttl=64 time=29.0 ms
1457 bytes from 192.168.1.44: icmp_seq=59 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=60 ttl=64 time=29.2 ms
1457 bytes from 192.168.1.44: icmp_seq=61 ttl=64 time=29.3 ms
1457 bytes from 192.168.1.44: icmp_seq=63 ttl=64 time=29.2 ms
^C
--- 192.168.1.44 ping statistics ---
64 packets transmitted, 49 received, 23.4375% packet loss, time 25240ms
rtt min/avg/max/mdev = 29.036/29.228/29.913/0.145 ms
~23% packet loss, as usual. No idea why tcpdump can't see it. Maybe the packet count is OK, but the contents aren't. There are a number of ping command providers in Linux; mine is from iputils, and I suspect it works slightly differently from the one @toncho11 used...
I use Ubuntu 16 (or 18). Checksum errors occur only when using urlget ftp://user:pass@192.168.1.34:21/image/fd360.bin > /root/fd360_1.bin, not with ping. I need the whole file to test later; otherwise it will be "Did you test with this line or this line?". Thanks.
Thanks for reminding me, @toncho11 - I forgot. The checksum test is tcp-level!
I can get you the whole file, although I'm sure you can edit the single line that has changed?
—Mellvik
Just to make sure we've been through all alternatives, could you run exactly the same, just add the -v option to ping?
Sadly, the -v option does not add anything. So I tried -A (adaptive interval) and -D (print timestamps):
$ ping -v -D -A -s 1449 192.168.1.44
PING 192.168.1.44 (192.168.1.44) 1449(1477) bytes of data.
[1606320070.337461] 1457 bytes from 192.168.1.44: icmp_seq=1 ttl=64 time=29.3 ms
[1606320070.738513] 1457 bytes from 192.168.1.44: icmp_seq=3 ttl=64 time=29.3 ms
[1606320070.939034] 1457 bytes from 192.168.1.44: icmp_seq=4 ttl=64 time=29.3 ms
[1606320071.139596] 1457 bytes from 192.168.1.44: icmp_seq=5 ttl=64 time=29.3 ms
[1606320071.540650] 1457 bytes from 192.168.1.44: icmp_seq=7 ttl=64 time=29.3 ms
[1606320071.741222] 1457 bytes from 192.168.1.44: icmp_seq=8 ttl=64 time=29.3 ms
[1606320071.941831] 1457 bytes from 192.168.1.44: icmp_seq=9 ttl=64 time=29.4 ms
[1606320072.142455] 1457 bytes from 192.168.1.44: icmp_seq=10 ttl=64 time=29.4 ms
[1606320072.543396] 1457 bytes from 192.168.1.44: icmp_seq=12 ttl=64 time=29.1 ms
[1606320072.743728] 1457 bytes from 192.168.1.44: icmp_seq=13 ttl=64 time=29.1 ms
[1606320072.944048] 1457 bytes from 192.168.1.44: icmp_seq=14 ttl=64 time=29.1 ms
[1606320073.344663] 1457 bytes from 192.168.1.44: icmp_seq=16 ttl=64 time=29.1 ms
[1606320073.544972] 1457 bytes from 192.168.1.44: icmp_seq=17 ttl=64 time=29.1 ms
[1606320073.745259] 1457 bytes from 192.168.1.44: icmp_seq=18 ttl=64 time=29.0 ms
[1606320074.145914] 1457 bytes from 192.168.1.44: icmp_seq=20 ttl=64 time=29.1 ms
[1606320074.346245] 1457 bytes from 192.168.1.44: icmp_seq=21 ttl=64 time=29.1 ms
[1606320074.546588] 1457 bytes from 192.168.1.44: icmp_seq=22 ttl=64 time=29.1 ms
[1606320074.747089] 1457 bytes from 192.168.1.44: icmp_seq=23 ttl=64 time=29.2 ms
[1606320075.148055] 1457 bytes from 192.168.1.44: icmp_seq=25 ttl=64 time=29.3 ms
[1606320075.348578] 1457 bytes from 192.168.1.44: icmp_seq=26 ttl=64 time=29.3 ms
[1606320075.549118] 1457 bytes from 192.168.1.44: icmp_seq=27 ttl=64 time=29.3 ms
[1606320075.950036] 1457 bytes from 192.168.1.44: icmp_seq=29 ttl=64 time=29.2 ms
[1606320076.150592] 1457 bytes from 192.168.1.44: icmp_seq=30 ttl=64 time=29.3 ms
[1606320076.351109] 1457 bytes from 192.168.1.44: icmp_seq=31 ttl=64 time=29.3 ms
[1606320076.752126] 1457 bytes from 192.168.1.44: icmp_seq=33 ttl=64 time=29.3 ms
[1606320076.952656] 1457 bytes from 192.168.1.44: icmp_seq=34 ttl=64 time=29.3 ms
[1606320077.153239] 1457 bytes from 192.168.1.44: icmp_seq=35 ttl=64 time=29.3 ms
[1606320077.353619] 1457 bytes from 192.168.1.44: icmp_seq=36 ttl=64 time=29.1 ms
[1606320077.553978] 1457 bytes from 192.168.1.44: icmp_seq=37 ttl=64 time=29.1 ms
[1606320077.754338] 1457 bytes from 192.168.1.44: icmp_seq=38 ttl=64 time=29.1 ms
[1606320077.954714] 1457 bytes from 192.168.1.44: icmp_seq=39 ttl=64 time=29.1 ms
[1606320078.154999] 1457 bytes from 192.168.1.44: icmp_seq=40 ttl=64 time=29.0 ms
[1606320078.555574] 1457 bytes from 192.168.1.44: icmp_seq=42 ttl=64 time=29.0 ms
[1606320078.755895] 1457 bytes from 192.168.1.44: icmp_seq=43 ttl=64 time=29.1 ms
[1606320078.956230] 1457 bytes from 192.168.1.44: icmp_seq=44 ttl=64 time=29.1 ms
[1606320079.356792] 1457 bytes from 192.168.1.44: icmp_seq=46 ttl=64 time=29.0 ms
[1606320079.557321] 1457 bytes from 192.168.1.44: icmp_seq=47 ttl=64 time=29.3 ms
[1606320079.757824] 1457 bytes from 192.168.1.44: icmp_seq=48 ttl=64 time=29.2 ms
[1606320079.958293] 1457 bytes from 192.168.1.44: icmp_seq=49 ttl=64 time=29.2 ms
[1606320080.158820] 1457 bytes from 192.168.1.44: icmp_seq=50 ttl=64 time=29.3 ms
[1606320080.359277] 1457 bytes from 192.168.1.44: icmp_seq=51 ttl=64 time=29.2 ms
[1606320080.559737] 1457 bytes from 192.168.1.44: icmp_seq=52 ttl=64 time=29.2 ms
[1606320080.760242] 1457 bytes from 192.168.1.44: icmp_seq=53 ttl=64 time=29.3 ms
[1606320081.161200] 1457 bytes from 192.168.1.44: icmp_seq=55 ttl=64 time=29.3 ms
[1606320081.361751] 1457 bytes from 192.168.1.44: icmp_seq=56 ttl=64 time=29.3 ms
[1606320081.562313] 1457 bytes from 192.168.1.44: icmp_seq=57 ttl=64 time=29.3 ms
[1606320081.963337] 1457 bytes from 192.168.1.44: icmp_seq=59 ttl=64 time=29.3 ms
[1606320082.163663] 1457 bytes from 192.168.1.44: icmp_seq=60 ttl=64 time=29.1 ms
[1606320082.363955] 1457 bytes from 192.168.1.44: icmp_seq=61 ttl=64 time=29.0 ms
[1606320082.564279] 1457 bytes from 192.168.1.44: icmp_seq=62 ttl=64 time=29.1 ms
[1606320082.764564] 1457 bytes from 192.168.1.44: icmp_seq=63 ttl=64 time=29.0 ms
[1606320082.964924] 1457 bytes from 192.168.1.44: icmp_seq=64 ttl=64 time=29.1 ms
[1606320083.165219] 1457 bytes from 192.168.1.44: icmp_seq=65 ttl=64 time=29.0 ms
^C
--- 192.168.1.44 ping statistics ---
65 packets transmitted, 53 received, 18.4615% packet loss, time 12828ms
rtt min/avg/max/mdev = 29.029/29.182/29.373/0.109 ms, ipg/ewma 200.438/29.135 ms
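To see which probes went unanswered, the pasted output can be scanned for gaps in icmp_seq. The following is only a host-side sketch in Python, not part of ELKS or iputils; the function names missing_sequences and loss_percent are made up for illustration, and the parser assumes the iputils reply format shown above. Probes lost after the last reply that did arrive cannot be detected this way.

```python
import re


def missing_sequences(ping_output):
    """Return the icmp_seq numbers between the first and last reply that got no answer."""
    seen = sorted(int(n) for n in re.findall(r"icmp_seq=(\d+)", ping_output))
    if not seen:
        return []
    received = set(seen)
    return [n for n in range(seen[0], seen[-1] + 1) if n not in received]


def loss_percent(transmitted, received):
    """Packet loss as a percentage, as reported in the ping statistics line."""
    return 100.0 * (transmitted - received) / transmitted


if __name__ == "__main__":
    # First replies of the run above, copied from the thread.
    sample = """\
[1606320070.337461] 1457 bytes from 192.168.1.44: icmp_seq=1 ttl=64 time=29.3 ms
[1606320070.738513] 1457 bytes from 192.168.1.44: icmp_seq=3 ttl=64 time=29.3 ms
[1606320070.939034] 1457 bytes from 192.168.1.44: icmp_seq=4 ttl=64 time=29.3 ms
[1606320071.139596] 1457 bytes from 192.168.1.44: icmp_seq=5 ttl=64 time=29.3 ms
[1606320071.540650] 1457 bytes from 192.168.1.44: icmp_seq=7 ttl=64 time=29.3 ms
"""
    print(missing_sequences(sample))
    print(loss_percent(65, 53))
```

Run over the whole capture, this would show whether the drops follow a regular pattern (here roughly every third or fourth probe) rather than occurring at random.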
OK @pawosm-arm, I really appreciate your efforts. This one remains a mystery.
If you'd like to put some icing on the cake, you could put in the patched up driver and run @toncho11's urlget script. That would be very helpful for our QA.
—Mellvik
If the network cable is unplugged, ktcp does not tell you. Next, urlget crashes everything.
I managed to download only once in 2 hours ... and there were no errors.
It was just a terrible experience ... I think I hit several partition or FAT errors, kilo is slow to the point of being unusable, cp fails but does not say why ... The sys script does not work, @ghaerr ... It cannot copy files to a presumably empty partition. It does copy linux, but not the commands from /bin and /etc/. Is /mnt/bin the right destination?
If the network cable is unplugged, ktcp does not tell you. Next, urlget crashes everything.
If you press ctrl+c while urlget is fetching a large file (even if there are no errors reported), bad things will happen afterwards and soon ktcp gets stuck.
I'm running the following script on ELKS built with the latest wd.zip:
#!/bin/sh
# Download the same image ten times, then list and checksum every copy.
for i in 1 2 3 4 5 6 7 8 9 10
do
    urlget http://one_of_my_ip_addresses/elks/32MB.img.bz2 > img-$i.bz2
done
ls -la img-*.bz2
for i in 1 2 3 4 5 6 7 8 9 10
do
    sum img-$i.bz2
done
The bzipped 32MB.img is 348047 bytes long, yet it still takes time to download, and on an 8086 computing the checksum also takes time. I'll let you know when it's done.
It worked fine, no overruns, no TCP checksum errors. Standard output redirected to a file:
-rw-rw-rw- 1 root root 348047 Jan 16 05:54 img-1.bz2
-rw-rw-rw- 1 root root 348047 Jan 16 06:13 img-10.bz2
-rw-rw-rw- 1 root root 348047 Jan 16 05:57 img-2.bz2
-rw-rw-rw- 1 root root 348047 Jan 16 05:59 img-3.bz2
-rw-rw-rw- 1 root root 348047 Jan 16 06:01 img-4.bz2
-rw-rw-rw- 1 root root 348047 Jan 16 06:03 img-5.bz2
-rw-rw-rw- 1 root root 348047 Jan 16 06:05 img-6.bz2
-rw-rw-rw- 1 root root 348047 Jan 16 06:07 img-7.bz2
-rw-rw-rw- 1 root root 348047 Jan 16 06:09 img-8.bz2
-rw-rw-rw- 1 root root 348047 Jan 16 06:11 img-9.bz2
42514 680
42514 680
42514 680
42514 680
42514 680
42514 680
42514 680
42514 680
42514 680
42514 680
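Comparing ten sizes and ten checksums by eye works here, but for longer runs the check can be automated on the host. The sketch below is an illustration only: the function name all_downloads_match is made up, and it assumes the redirected output has exactly the two line shapes shown above (a nine-column ls -la entry with the size in the fifth column, and a two-number line from sum).

```python
def all_downloads_match(listing):
    """Return True if every file size and every checksum line in the listing is identical."""
    sizes = set()
    checksums = set()
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) == 9 and fields[0].startswith("-"):
            # ls -la entry: permissions, links, owner, group, size, date (3 fields), name
            sizes.add(int(fields[4]))
        elif len(fields) == 2 and all(f.isdigit() for f in fields):
            # sum output: checksum and block count
            checksums.add((int(fields[0]), int(fields[1])))
    return len(sizes) == 1 and len(checksums) == 1


if __name__ == "__main__":
    # Two entries copied from the output above.
    sample = """\
-rw-rw-rw- 1 root root 348047 Jan 16 05:54 img-1.bz2
-rw-rw-rw- 1 root root 348047 Jan 16 05:57 img-2.bz2
42514 680
42514 680
"""
    print(all_downloads_match(sample))
```

A False result would only say that at least one copy differs; finding which one still means looking at the individual lines.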
Thank you, @pawosm-arm, Great testing, perfect result. We can thus close that part of the issue. Actually, afaik this issue should now be closed. We've discussed the usefulness of the remaining kernel messages and found them useful. The method to avoid them is to switch to a different 'console'.
@toncho11, there is no way to detect a disconnected cable on old hardware, thus the consequences of doing so are undefined. I'm not surprised by your experience - you're hitting robustness issues that have not been addressed yet. You mention a number of different potential problems in your message; it would be useful if you could address them one by one and describe the preconditions, the intended and expected outcome, and the problem. That way we can address - or possibly explain - each one. Thank you for your testing efforts. Your help with the networking issues has been invaluable. These activities are very important for the development and continuous improvement of ELKS.
--Mellvik
@Mellvik
Currently I am unable to test your fix with several download images. I will do this later or maybe even next week.
OK @toncho11, looking forward to it. In the meantime - if you think the ktcp messages issue has been adequately addressed, please close the issue. We can still continue communications on this thread, or if desirable, open a new one.
Thank you. -Mellvik
Hello @toncho11,
It was just a terrible experience ... I think I hit several partition or FAT errors, kilo is slow to the point of being unusable,
Well, although ELKS has come a long way, there are some times when it can seem like nothing works. It seems you just hit one of them. I need more information on the partition or FAT errors, in order to fix them, if they are repeatable. It sounds like kilo, which was written for 32-bit Linux mostly as a showpiece of how small a visual editor source code could be, needs quite a bit of horsepower. I ported it - only to find that it also uses tons of memory for small files.
cp fails but it does not say why ... The sys script does not work @ghaerr ... It can not copy files to a presumably empty partition.
Can you elaborate on cp? This could also be the problem with the sys script, since sys all worked last time you ran it.
It does copy linux, but not the commands from /bin and /etc/. Is the /mnt/bin the right destination ?
Sys will definitely fail without warning if you have a partition mounted on /mnt, that still isn't fixed. Makeboot uses /tmp/mnt but sys still uses /mnt to mount the device passed. Is that the problem?
I left a "set -x" commented out on line 7 of /bin/sys for just this reason. Uncomment it and it should show more about what it's doing. Unfortunately, much can scroll off the screen; running it from a serial login on a remote terminal helps for debugging.
Thank you!
I will report later on the WD ring buffer fix.
I do confirm that there are no more "bad checksum" error messages and no ping errors.
Good job! :)
ktcp prints error messages all the time. This actually prevents me from working with ELKS, because I am interrupted in the middle of a command.
example:
tcp: Refusing packet ... :9564 -> 1025