ghaerr / elks

Embeddable Linux Kernel Subset - Linux for 8086
Other
1.01k stars 108 forks source link

ktcp status #610

Closed Mellvik closed 4 years ago

Mellvik commented 4 years ago

I believe it makes sense to open a new issue on ktcp: It has not been tested in a while, and much has changed in and around the kernel since.

I fired up KTCP on a physical machine this morning, and the overall status is: It does not work. Then the good news:

This is from ELKS:

# ps
  PID   GRP  TTY USER STAT CSEG DSEG  HEAP   FREE   SIZE COMMAND
    1     0      root    S 3e46 3ff0  3072  13252  26800 /bin/init 
    8     8    1 root    S 4e93 5030     0  16275  27344 /bin/getty /dev/tty1 
    9     9   S0 root    S 5540 611d  2193  13914  75648 -/bin/sh 
   10     9   S0 root    S 67b8 6bca  6144  10045  44832 ktcp 10.0.2.15 /dev/eth 
   23    23      root    S 49ca 72aa     0  16253  26864 httpd 
   21    21      root    S 4b6c 44d1     0  16251  25408 telnetd 
   24     9   S0 root    R 4ca7 7ec0  1025  14054  26672 ps 
# telnet 10.0.2.1
Connecting to 10.0.2.1 port 23

This is from the pinging host:

From 10.0.2.1 icmp_seq=67 Destination Host Unreachable
From 10.0.2.1 icmp_seq=68 Destination Host Unreachable
From 10.0.2.1 icmp_seq=69 Destination Host Unreachable
From 10.0.2.1 icmp_seq=70 Destination Host Unreachable
64 bytes from 10.0.2.15: icmp_seq=41 ttl=64 time=34171 ms
64 bytes from 10.0.2.15: icmp_seq=42 ttl=64 time=33151 ms
64 bytes from 10.0.2.15: icmp_seq=43 ttl=64 time=32128 ms
64 bytes from 10.0.2.15: icmp_seq=71 ttl=64 time=3096 ms
64 bytes from 10.0.2.15: icmp_seq=72 ttl=64 time=2069 ms
64 bytes from 10.0.2.15: icmp_seq=73 ttl=64 time=1042 ms
64 bytes from 10.0.2.15: icmp_seq=74 ttl=64 time=54.2 ms
From 10.0.2.1 icmp_seq=109 Destination Host Unreachable
From 10.0.2.1 icmp_seq=110 Destination Host Unreachable
From 10.0.2.1 icmp_seq=111 Destination Host Unreachable
From 10.0.2.1 icmp_seq=112 Destination Host Unreachable
From 10.0.2.1 icmp_seq=113 Destination Host Unreachable
From 10.0.2.1 icmp_seq=114 Destination Host Unreachable
From 10.0.2.1 icmp_seq=115 Destination Host Unreachable
From 10.0.2.1 icmp_seq=116 Destination Host Unreachable
From 10.0.2.1 icmp_seq=117 Destination Host Unreachable
From 10.0.2.1 icmp_seq=118 Destination Host Unreachable
From 10.0.2.1 icmp_seq=119 Destination Host Unreachable
From 10.0.2.1 icmp_seq=120 Destination Host Unreachable
From 10.0.2.1 icmp_seq=121 Destination Host Unreachable
From 10.0.2.1 icmp_seq=122 Destination Host Unreachable
From 10.0.2.1 icmp_seq=123 Destination Host Unreachable
From 10.0.2.1 icmp_seq=124 Destination Host Unreachable
From 10.0.2.1 icmp_seq=125 Destination Host Unreachable
From 10.0.2.1 icmp_seq=126 Destination Host Unreachable
64 bytes from 10.0.2.15: icmp_seq=75 ttl=64 time=54498 ms
64 bytes from 10.0.2.15: icmp_seq=76 ttl=64 time=53465 ms
64 bytes from 10.0.2.15: icmp_seq=77 ttl=64 time=52430 ms
64 bytes from 10.0.2.15: icmp_seq=78 ttl=64 time=51394 ms
64 bytes from 10.0.2.15: icmp_seq=79 ttl=64 time=50359 ms
64 bytes from 10.0.2.15: icmp_seq=80 ttl=64 time=49323 ms
64 bytes from 10.0.2.15: icmp_seq=106 ttl=64 time=22287 ms
64 bytes from 10.0.2.15: icmp_seq=107 ttl=64 time=21253 ms
64 bytes from 10.0.2.15: icmp_seq=108 ttl=64 time=20218 ms
64 bytes from 10.0.2.15: icmp_seq=127 ttl=64 time=487 ms
From 10.0.2.1 icmp_seq=162 Destination Host Unreachable
From 10.0.2.1 icmp_seq=163 Destination Host Unreachable
From 10.0.2.1 icmp_seq=164 Destination Host Unreachable
From 10.0.2.1 icmp_seq=165 Destination Host Unreachable
Mellvik commented 4 years ago

Ping from the outside works - but only 16 times. Then ktcp dies and needs to be restarted. Kill works.

Lets start here at the basics - there is something wrong with either your particular test system with the IRQ issues, or in ktcp. Ktcp has no UDP support, very little IP support, and literally only a few lines of ICMP support - to return an ECHO request. No memory is allocated or released by any of these routines in ktcp (except in the convoluted ktcp<->sockets code), so this should be easily debugged. I will post a PR later today that will provide complete ARP and IP packet incoming and outoing printing support, which should allow us to see a complete log on serial console, which can then be analyzed or forwarded.

To me, ping (and thus basic ICMP) is the level one 'acceptance test' for a connection. The proof that IP works, which obviously is the carrier for whatever comes above. That's why I keep banging on ping. 'When ping works we have a connection.' There may still be buffer problems and other issues on the interface level, that don't show until we start transferring larffer packets, but a stable ping generally (IMHO) still means the hardware level is OK. Looking at your test log, it seems to me that the basic networking is not working, which I now have working on my system (it did not work initially at all). I'm thinking that there is a problem at the ethernet/link level.

That may well be the case. But the same hardware works fine with dos and MTCP, and I'm switching back and forth between 2 interfaces. From my perspective we have either a timing/clock/lost interrupt problem or somehow the packlets are 'damaged' on their way from the i/f to memory. Still, it's confusing that it works on QEMU. BTW - when telnetting in to elks, we do indeed have a twoway connection, since we get all the way to /bin/login. Further, httpd works fine - however simple. Thus my conclusion about thinking you should test on another system or on another IRQ.

That's on the agenda for tomorrow.

--Mellvik

ghaerr commented 4 years ago

But the same hardware works fine with dos and MTCP, and I'm switching back and forth between 2 interfaces.

Good to know. I was thinking not necessarily that the hardware had problems, but possibly that ELKS interrupt handling for 8259 EOI, or the ethernet driver (or even the XT IRQ 1 interrupt) could be handled improperly. Given that IRQ 1 and 9 are acting strangely, I'm suspicious of 8259 related issues. It is unclear whether ELKS has ever had IRQ > 7 working properly. Do you know of any device that runs well with an IRQ > 8 on ELKS? I don't think there are any! So we may not be even handling the 8259 chained-interrupt properly. That's why I suggest moving your card down to an IRQ < 7, like 5. (make sure you turn off COM3 in ports.h if you do that and don't have a second com card installed).

From my perspective we have either a timing/clock/lost interrupt problem

Agreed.

or somehow the packlets are 'damaged' on their way from the i/f to memory.

Could be, as well as the ethernet driver may not be able to handle any semblance of faster traffic. This should be easily sorted out by observing every packet, from ARP to IP to ICMP, which I plan on providing debugging for, for ping.

Still, it's confusing that it works on QEMU.

No - sadly there are just all sorts of bugs in lots of places. It's taken a lot just to get telnet and telnetd networking running on QEMU, where the link layers are emulated. I think now some better working upper layers will allow us to debug the lower layers, where there are more problems.

BTW - when telnetting in to elks, we do indeed have a twoway connection, since we get all the way to /bin/login.

Oh, I know - both telnet and telnetd were previously quite broken, and would not work at all talking to/from OS X. Strangely enough, they seemed to work on other's systems, but I am guessing those systems did not have RFC-compliant implementations, as finding the memchr bug showed that telnet never negotiated anything, which caused it to fail immediately on OSX. I now have both working great, full duplex, simultaneously from both host and ELKS, on QEMU. TCP, after cleaning up the runaway memory allocations from an unbounded retransmit queue, now is working much better, although there are still problems, which we'll get into later. It is possible these memory problems previously went unnoticed, on systems that never dropped a packet.

The TCP implementation of ktcp is actually fairly nicely done - I'm comparing its implementation to a known working finite state diagram of TCP/IP to try to find what parts might be responsible for known bugs, which mostly include connection tear-down and retransmission at this point.

Further, httpd works fine - however simple.

Yes, it does, however it will only operate 7-8 times before ktcp crashes. I haven't been able to figure that one out yet. Multiple, subsequent telnet connections in either direction aren't working either; I think the connection tear-down is the reason for that.

On another note - netstat is quite useful, but be warned that it uses a special near-TCP-like connection mechanism to ktcp, and when connections aren't working in ktcp, netstat won't work either.

Mellvik commented 4 years ago

Thanks again for the 'runthrough' - @ghaerr, always very useful.

I'll put the wrapup of xt_key.c on hold for a few days in order to expedite the network testing. I ran into some snags with NUMLOCK - the unused, but programmed implementation turned out to be - well, nonfuctional.

It is unclear whether ELKS has ever had IRQ > 7 working properly. Do you know of any device that runs well with an IRQ > 8 on ELKS? I don't think there are any! So we may not be even handling the 8259 chained-interrupt properly. That's why I suggest moving your card down to an IRQ < 7, like 5. This is a very interesting angle indeed. Since yesterday (actually many times before, but stronger now) I've had this nagging feeling that we have an interrupt problem that manifests itself in weird ways on physical (slow) hardware. Not directly related to some specific IRQ, more like a 'ringing (irq) line' or maybe a 'disable interrupts' that lasts too long - causing extra bad performance and lost interrupts. Anyway, not that this suspicion is on the table, it gets attention. I'll include these changes in the ongoing testing.

--Mellvik

Mellvik commented 4 years ago

Not much time for testing today, but here's what I got:

13:51:49.109311 ARP, Request who-has 10.0.2.2 tell 10.0.2.15, length 50 13:51:49.109429 ARP, Reply 10.0.2.2 is-at b8:27:eb:9a:77:bc (oui Unknown), length 28 13:51:49.667190 IP 10.0.2.15.1024 > 10.0.2.2.telnet: Flags [S], seq 3958024376, win 256, options [mss 1024], length 0

It is interesting that the server doesn't respond at all, so I did a telnet from the mac to the raspbian host to get a metric towards the raspbian:

15:48:03.474966 IP 192.168.10.146.54948 > 192.168.10.26.telnet: Flags [S], seq 1739357785, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 1976625981 ecr 0,sackOK,eol], length 0 15:48:03.475160 IP 192.168.10.26.telnet > 192.168.10.146.54948: Flags [S.], seq 4129236338, ack 1739357786, win 28960, options [mss 1460,sackOK,TS val 3426687769 ecr 1976625981,nop,wscale 7], length 0

Apparently the server wants more options ... OR the window advertised by elks is too small. I thought window size should be larger than (possibly a multiple of) MSS?

The following packet trace is from ktcp with ARP_KLUDGE enabled.
Ping from server: The server doesn't recognize the echo replies - just sits waiting till the arp's start and then reports host unreachable. So it seems like the server doesn't accept the ARP replies either.

16:53:08.127160 IP 10.0.2.2 > 10.0.2.15: ICMP echo request, id 21372, seq 1, length 64 16:53:08.135446 IP 10.0.2.15 > 10.0.2.2: ICMP echo reply, id 21372, seq 1, length 64 16:53:09.167971 IP 10.0.2.2 > 10.0.2.15: ICMP echo request, id 21372, seq 2, length 64 16:53:09.176325 IP 10.0.2.15 > 10.0.2.2: ICMP echo reply, id 21372, seq 2, length 64 16:53:10.207971 IP 10.0.2.2 > 10.0.2.15: ICMP echo request, id 21372, seq 3, length 64 16:53:10.216349 IP 10.0.2.15 > 10.0.2.2: ICMP echo reply, id 21372, seq 3, length 64 16:53:11.247953 IP 10.0.2.2 > 10.0.2.15: ICMP echo request, id 21372, seq 4, length 64 16:53:11.256340 IP 10.0.2.15 > 10.0.2.2: ICMP echo reply, id 21372, seq 4, length 64 16:53:12.287957 IP 10.0.2.2 > 10.0.2.15: ICMP echo request, id 21372, seq 5, length 64 16:53:12.296342 IP 10.0.2.15 > 10.0.2.2: ICMP echo reply, id 21372, seq 5, length 64 16:53:13.167806 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:13.170243 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:13.327980 IP 10.0.2.2 > 10.0.2.15: ICMP echo request, id 21372, seq 6, length 64 16:53:13.336320 IP 10.0.2.15 > 10.0.2.2: ICMP echo reply, id 21372, seq 6, length 64 16:53:14.207824 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:14.210327 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:14.367985 IP 10.0.2.2 > 10.0.2.15: ICMP echo request, id 21372, seq 7, length 64 16:53:14.376350 IP 10.0.2.15 > 10.0.2.2: ICMP echo reply, id 21372, seq 7, length 64 16:53:15.247809 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:15.250264 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:15.407953 IP 10.0.2.2 > 10.0.2.15: ICMP echo request, id 21372, seq 8, length 64 16:53:15.416326 IP 10.0.2.15 > 10.0.2.2: ICMP echo reply, id 21372, seq 8, length 64 16:53:16.447951 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:16.450390 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:17.487830 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:17.490272 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:18.527825 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:18.530261 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:19.568685 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:19.571146 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:20.607832 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:20.610296 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:21.647828 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:21.650281 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:22.688572 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:22.691041 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:23.727837 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:23.730434 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:24.767838 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:24.770291 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:25.808523 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:25.810987 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:26.847823 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:26.850271 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:53:27.887811 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:53:27.890289 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:55:02.435418 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:55:02.437919 ARP, Reply 10.0.2.15 is-at 52:54:00:12:34:56 (oui Unknown), length 50 16:55:03.487859 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:55:04.527831 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:55:05.568002 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:55:06.607826 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28 16:55:07.647921 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28


Hopefully at least somewhat useful. I'll get som more info over the weekend.

--M
ghaerr commented 4 years ago

Hello @Mellvik!

  • Dropping the IRQ down to 5 was unsuccessful - and (possibly) interesting: I was not able to make either interface work. Initial ARP ok,

My take is that, given the known problem with this card/system with IRQ 1, it is going to be very hard to debug higher level TCP. There are just too many variables when packets may be lost. Ping possibly another story, but I'd like to see the results of your Portable 386 system, to see whether the IRQ 1 issue is across multiple systems.

That said, I thought you mentioned you had the card working on another base address. Was that also including another IRQ?

We need a test scenario where we can get a network card running on a lower-than 8 IRQ, unless your 386 system happens to work well on IRQ 9.

I suppose that single packet exchange does not depend on interrupts working.

More likely a single packet was transmitted and IRQ not reenabled or whatever, causing all sorts of other possible problems.

So the testing below is back to basics - irq9/300.

Thanks. I'm not sure whether we can depend on much test data, too much of a wild goose chase considering things work on QEMU and you have definite issues with that system/card/irq on ELKS.

  • Outgoing ARP works, and outgoing telnet proceeds like this:
# telnet 10.0.2.2
Connecting to 10.0.2.2 (202000a) port 23
arp: got reply on timed wait

then hangs - the telnet process is 'unkillable'. Should telnet time out and resend the SYN at this point?

Remember, telnet isn't doing any timeout or SYN processing. It's hung in kernel connect, and ktcp is doing timeout and SYN checking. I will have to check - but I don't think ktcp handles timeout processing on SYN handshake, only on TCP data packets sent with unacknowledged response. This is a good example of how quickly things can go downhill with our current TCP/IP stack. So it could easily be that ktcp never got an SYN response due to your hardware/irq issues and telnet is hung in an uninterrupted connect, (of which changing that is yet another issue).

There literally are just too many variables when packets are lost. At this point, solving all those problems is a low priority - instead focusing on getting even "normal, working" communications working under NO packet loss conditions. This is why we need to "go back to basics" and focus first on getting a network card that works with ELKS, and 2) working on the fact that ping doesn't even work.

Apparently the server wants more options ... OR the window advertised by elks is too small

Hard to say. IMO, during a 3-way handshake - which isn't ever completed with your system and ELKS - the server has two options: SYN back or RST back. And remember that ELKS stack doesn't handle any IP options.

I thought window size should be larger than (possibly a multiple of) MSS?

I don't think so, but I'll have to check. I have only got everything tested on OSX, haven't actually talked with Pi yet, so that's yet another variable.

The following packet trace is from ktcp with ARP_KLUDGE enabled.

The only usefulness with ARP_WAIT_KLUGE at this point is if turning it on makes things work.

Mellvik commented 4 years ago

Ok, @ghaerr, I'll have to digest all this tomorrow. Remember though: There are NO lost packets. The packet trace is not lying. The hardware is fine, it works with DOS.

So it could easily be that ktcp never got an SYN response due to your hardware/irq issues and telnet is hung in an uninterrupted connect, (of which changing that is yet another issue).

No. AFAIK the server never responded. We need to figure out why.

Anyway, I need to investigate the interrupt issue first.

--M

ghaerr commented 4 years ago

No problems @Mellvik. I get a little confused as to which system you're talking about when you say "the server", when there is ELKS, OSX, and Pi in the conversation. I'm also not familiar with the particular packet sniffer and need to see the ELKS debug output to debug quickly, as that output is closer to the ktcp procedure calling sequence.

Agreed your hw is fine, I'm talking about a working ELKS network driver and 8259 interrupt processing w/ELKS. We can't debug high level stuff if low level isn't known.

We need to stay as simple as possible, there's way too many problems for me to track, frankly. I've just identified and fixed two more problems with RST handling.

I think a working network card/driver/irq and ping is where we should start for you, since QEMU runs a single connection both ways fairly well. That means running ktcp with full debug, running ping until it fails, with ktcp debug output and packet sniffer. Only after that is resolved will we make progress on TCP, I think.

I'm only testing a single network connect, transfer, and disconnect. Then reboot, because there are all sorts of additional problems with ktcp after ANY closed/completed connection. It basically doesn't work twice. (Yes, its easier to reboot QEMU than hw each time!)

Thanks again for your testing!

Mellvik commented 4 years ago

I get a little confused as to which system you're talking about when you say "the server", when there is ELKS, OSX, and Pi in the conversation. Communication is hard :-). Like last week about your newly discovered (old) PC and I read your message like 'cannot create bootable floppy' instead of (later apparent) 'PC won't boot from floppy'. Anyway - I use 'server' denoting the other end of a telnet connection, and will make sure to point out when that end is elks. I'm also not familiar with the particular packet sniffer and need to see the ELKS debug output to debug quickly, as that output is closer to the ktcp procedure calling sequence. This is good ol' tcpdump, our reliable network debugging companion since the 80s. Any level of detail down to the last bit if you like. Agreed your hw is fine, I'm talking about a working ELKS network driver and 8259 interrupt processing w/ELKS. We can't debug high level stuff if low level isn't known.

My assessment too. It's like fixing the roof when the foundation is shaky. I think a working network card/driver/irq and ping is where we should start for you, since QEMU runs a single connection both ways fairly well. That means running ktcp with full debug, running ping until it fails, with ktcp debug output and packet sniffer. Only after that is resolved will we make progress on TCP, I think.

Copy that. I'm continuing with the 386 today, where the first observation is that the keyboard interrupt issue seems gone (@IRQ9/300). No extra chars @ the login prompt. Interesting. More later. I'm only testing a single network connect, transfer, and disconnect. Then reboot, because there are all sorts of additional problems with ktcp after ANY closed/completed connection. It basically doesn't work twice. (Yes, its easier to reboot QEMU than hw each time!)

Working with real hw is indeed very different. From compile to booted system is like 5 minutes, login (starting ash/sash) 12-15 seconds etc. That's part of the game, no complaints. BTW, you may find this interesting: ps() has become an invaluable tool at the price of speed: Again, not a complaint, just an observation. Like 1-2 seconds per line/process on the 286. Ps and netstat are now the most frequently used commands at this end.

--Mellvik

Mellvik commented 4 years ago

Update: While tracking (and tracing) interrupts:

Mellvik commented 4 years ago

@ghaerr, I forgot to mention: your latest PR (unmerged) is included.

ghaerr commented 4 years ago

@Mellvik, very interesting.

This confirms my suspicion that something is strange with the hardware on your other PC :) (Yes, I know it works with DOS, but lets just say it's finicky, and needs something additionally done in software to make it work properly on ELKS). I suggest continued testing only on the 386 for the time being.

It also sounds like you are confirming the ability to (finally) remap IRQ and addresses on network cards.... ?

On the IP front, there seem to be a general checksum problem.

Very interesting, as the checksum code was converted from ASM to C during the bcc conversion a at least a couple years ago. This could easily be why ping doesn't work, and possibly everything else when connecting to systems that perhaps ignore bad checksum packets (as they should).

Perhaps I should port the old ASM code, rather than try to debug the new code. It's on my list, I'll just do it.

Is anything connecting either way on the new system? With this new knowledge of bad checksum, I wouldn't be surprised if nothing works. It is very strange that we've not seen this problem before (nor do I on QEMU, unless perhaps QEMU is ignoring checksums and rewriting packets itself).

Thanks for your testing, I think this is a breakthrough!

ghaerr commented 4 years ago

Further inspection shows it may not be the ASM to C conversion, but rather switched addresses at a time after chksum calculation due to the ethernet implementation. I will need to see if macOS can turn on packet debug printing for kernel TCP/IP.

Your trace shows that checksum errors are sometimes present in IP, always present in ICMP, and sometimes present in TCP also, do you agree?

There is also the possibility that I broke the ntohs macro and its manifesting itself with a windowsize > 256. Perhaps trying changing ntohs in line 5 of elks/libc/include/linuxmt/arpa/inet.h back to:

#define ntohs(x)    ( (((x) >> 8) & 0xff) | ((x) << 8) )

I'll do this with my next PR, while looking in to checksum issues.

Mellvik commented 4 years ago

On the 386 IRQ9 and 5 seems OK, neither makes 'noise' on IRQ1. This confirms my suspicion that something is strange with the hardware on your other PC :) (Yes, I know it works with DOS, but lets just say it's finicky, and needs something additionally done in software to make it work properly on ELKS). I suggest continued testing only on the 386 for the time being.

Indeed, @ghaerr. I like 'finicky'. And yes, I'll stick to the 386 'until further notice'. I'm also sufficiently curious about this to fire up a 2nd 286 system to see if the symptom is unit or product related. There may even be bios tricks at play to compensate for hw 'irregulaities'. Come to think of I'll fire up early (pre 32bit) minix on the machine and see what happens. It also sounds like you are confirming the ability to (finally) remap IRQ and addresses on network cards.... ?

Maybe a little premature, but 5 and 9 looks good (@io300, another variable). Need to test a higher irq too. On the IP front, there seem to be a general checksum problem.

Very interesting, as the checksum code was converted from ASM to C during the bcc conversion a at least a couple years ago. This could easily be why ping doesn't work, and possibly everything else when connecting to systems that perhaps ignore bad checksum packets (as they should).

Perhaps I should port the old ASM code, rather than try to debug the new code. It's on my list, I'll just do it.

Agree on both accounts. Go for asm. Is anything connecting either way on the new system? With this new knowledge of bad checksum, I wouldn't be surprised if nothing works.

Nothing works - and there may be other interesting observations to be made, but I ran out of time - again. It is very strange that we've not seen this problem before (nor do I on QEMU, unless perhaps QEMU is ignoring checksums and rewriting packets itself).

An (in)famous shortcut - wouldn't be surprised...

--M

Mellvik commented 4 years ago

Further inspection shows it may not be the ASM to C conversion, but rather switched addresses at a time after chksum calculation due to the ethernet implementation. I will need to see if macOS can turn on packet debug printing for kernel TCP/IP. Your trace shows that checksum errors are sometimes present in IP, always present in ICMP, and sometimes present in TCP also, do you agree?

Reading the packet trace again, I am confused: icmp checksums from elks are consistently wrong, but when looking at the tcp (telnet) exchange, it's the linux (raspbian) packets that are reported as wrong. I need to look closer into this particular exchange, maybe connect a mac to that network segment to get a reference. Tomorrow is our 4th of july, so it may not happen till monday.

--M

ghaerr commented 4 years ago

I am confused: icmp checksums from elks are consistently wrong, but when looking at the tcp (telnet) exchange, it's the linux (raspbian) packets that are reported as wrong.

I'm glad you found this. I've been too deep into the checksum code, which I don't dare change until I can get tcpdump options figured out to report only the QEMU session packets, as switching off the Wifi on my macOS laptop also switches off the entire TCP stack, which shuts down ELKS -> macOS telnetd as well. tcpdump 'tcp port 23' doesn't seem to work.

However - reading the manuals over here, it looks like ICMP echo has its own checksum field, and guess what? ELKS doesn't compute it, so that's a problem right there. Not sure whether that has to do with not getting a PING response from ELKS on your Pi other other system though.

ghaerr commented 4 years ago

Hello @Mellvik,

I think this is a breakthrough!

Unfortunately, after way, way too many hours, I've determined there are no IP or TCP checksum problems at all. tcpdump reports checksums incorrectly. (https://sokratisg.net/2012/04/01/udp-tcp-checksum-errors-from-tcpdump-nic-hardware-offloading/). I finally got Wireshark installed, it shows no packet problems. I thought it strange that the macOS (or any other stack) would allow IP datagrams or TCP segments with invalid checksums, as they are the primary method of checking validity. I tested this theory finally by purposely putting a bad checksum in TCP and IP headers separately. Immediately, all connectivity stopped, the OS X stack would not accept any IP or TCP packet with an incorrect checksum.

So - from now on out I'd like to get debugging information from the ktcp stack itself, rather from other tools not well understood or tested.

Given all this, I'm not sure your network card actually works yet in the 386 on ELKS, there are so many variables. You should be able to get a basic TCP/telnet connection with ktcp at this point, it does work. I'm not sure whether the ethernet driver works, it may be quite finicky.

As you can read in the PR, I finally got ktcp connections working directly on ELKS standalone, which will allow you to see the stack run connecting to itself. Unfortunately, without debugging on, the stack crashes quickly full debugging is on (very strange). When running debugging to the serial console, ktcp seems to run quite well.

I did add an IP checksum calculation to the ICMP echo response, perhaps that will allow an outside ping to work, I can't test it. Let me know what you find.

I'm going to focus on working on the stack self-contained on ELKS, as it does involve any hardware interfaces or lost packets, and there's still memory problems.

Mellvik commented 4 years ago

@ghaerr, I appreciate your frustration, but only partly agree with your analysis,

Unfortunately, after way, way too many hours, I've determined there are no IP or TCP checksum problems at all. tcpdump reports checksums incorrectly. (https://sokratisg.net/2012/04/01/udp-tcp-checksum-errors-from-tcpdump-nic-hardware-offloading/).

Admittedly the problem description (with ethernet offloading) rings a bell, and I had no idea it applied to tcpdump in 2020. However, it doesn't apply to our case - except to explain my comment yesterday about checksum errors on TCP packets coming from the Raspberrypi. AFAIK tcpdump reports correctly on all packets coming from elks. The ICMP checksum errors are probably genuine, and it reports correct checksums on TCP packets coming from elks. ELKS only supports NE2K ethernet cards, to my knowledge they never had offloading of any kind. They may even be older than the concept of offloading at the interface level. The RaspberryPi Is a different story, thus the checksum errors mentioned above. So - from now on out I'd like to get debugging information from the ktcp stack itself, rather from other tools not well understood or tested.

IMO the checksum problem doesn't make tcpdump useless, in particular since the problem doesn't apply to elks. It has been around forever for a reason. It's surprising that the checksumming - being a known problem for so long - has not been fixed or disabled. To my knowledge there are no perfect or bugfree tools out there, and to me tcpdump is still the fast and easy - and dependable - way to see what's really on the wire. Given all this, I'm not sure your network card actually works yet in the 386 on ELKS, there are so many variables. You should be able to get a basic TCP/telnet connection with ktcp at this point, it does work. I'm not sure whether the ethernet driver works, it may be quite finicky.

I agree - the proof is still pending. OTOH the ethernet (ne2k) driver - if I remember correctly - was thoroughly tested and 'approved' during our prevous round of networking fixes a year or two back. I did add an IP checksum calculation to the ICMP echo response, perhaps that will allow an outside ping to work, I can't test it. Let me know what you find.

I believe this will fix the ping issue. If it doesn't I'll compare the packet exchanges with a working one, byte by byte if necessary.

--Mellvik

ghaerr commented 4 years ago

Hello @Mellvik,

The latest PR #640 should solve a number of these problems.

Just after committing it, I also found that when telnetting out from ELKS, incoming data can overrun ELKS ability to receive it. A forced small window size seems to fix this for the time being. If you experience this, add 'len = 255;' in line 354 of ktcp/tcp_output.c:

    len = CB_BUF_SPACE(cb) - PUSH_THRESHOLD;
    if (len <= 0)
        len = 1;                /* Never advertise zero window size */
+len = 255;              //FIXME fixes external data in overrun in tcp read for now

After this fix, telnet in and out of ELKS runs quite well. Localhost operations are still shaky.

Mellvik commented 4 years ago

Just after committing it, I also found that when telnetting out from ELKS, incoming data can overrun ELKS ability to receive it. A forced small window size seems to fix this for the time being. If you experience this, add 'len = 255;' in line 354 of ktcp/tcp_output.c:

len = CB_BUF_SPACE(cb) - PUSH_THRESHOLD;
if (len <= 0)
    len = 1;                /* Never advertise zero window size */

+len = 255; //FIXME fixes external data in overrun in tcp read for now After this fix, telnet in and out of ELKS runs quite well. Localhost operations are still shaky.

@ghaerr, I'm testing this as we speak. Just a comment about overruns: IMO they should not be considered a problem. Overruns are as natural as rain, as is discarding packets - unavoidable. Retransmits fix that if the protocol is reliable, which tcp is. Routers and bridges and hosts discard packets all the time. The rule -AFAIK- is buffer full/no capacity = discard. Adjusting window size is tuning, and now that things start to work for real, it's time for tuning.

-M

Mellvik commented 4 years ago

@ghaerr,

again a heads up - there seem to be a memory corruption in arp.c, which would explain why we don’t see the same problems in loopback and possibly in qemu. More test results later today.

len = CB_BUF_SPACE(cb) - PUSH_THRESHOLD;
if (len <= 0)
    len = 1;                /* Never advertise zero window size */

+len = 255; //FIXME fixes external data in overrun in tcp read for now After this fix, telnet in and out of ELKS runs quite well. Localhost operations are still shaky.

This seem to have positive influence on outgoing telnet, but still I only get to the shall prompt.

—M

pawosm-arm commented 4 years ago

Been trying latest changes this morning. On a plus side, it's more stable, system does not crash from telnetd or from pinging ELKS from the outside as it used to ('icmp' messages are written to the ELKS console now). The biggest issue is the slowness of this communication. It's extremely slow (I've tried various bauds, yet most of the communication I've tested at 1200bps and it seems as 'good' as 300bps). And on the ELKS side lots of bad checksum messages are printed.

ghaerr commented 4 years ago

Hello @pawosm-arm,

Thanks for your testing, I'm glad to hear the slip side is still working, and more stable.

system does not crash from telnetd

Are you able to login to ELKS from outside and use it, albeit very slowly?

or from pinging ELKS from the outside as it used to ('icmp' messages are written to the ELKS console now).

Glad to hear ping is fixed. The ICMP message on ping received will probably be moved to a debug printf once things are working well.

And on the ELKS side lots of bad checksum messages are printed.

Is this the "tcp: BAD CHECKSUM ..." message? This is a serious issue and I will need to debug further. What system are the incoming packets coming from, and do the bad checksums happen sporadically (e.g., from ping), or consistently?

I notice further that ELKS isn't checksum checking IP headers, only TCP headers. I will add that.

It's extremely slow (I've tried various bauds, yet most of the communication I've tested at 1200bps

No doubt at 1200bps!! My plan has always been to get the serial driver working better. When running at higher baud rates (9600-19200), do you see any "serial: data overrun" messages?

Also, do you know whether you are running with CONFIG_CSLIP enabled in .config? I plan on removing this config option, which causes ktcp to be compiled differently, and move it into a command line option at ktcp startup.

Thanks!

pawosm-arm commented 4 years ago

Are you able to login to ELKS from outside and use it, albeit very slowly?

Yes

Is this the "tcp: BAD CHECKSUM ..." message? This is a serious issue and I will need to debug further. What system are the incoming packets coming from, and do the bad checksums happen sporadically (e.g., from ping), or consistently?

It's Gentoo Linux on the other side.

Messages are repeated approx. every 10sec. and are usually accompanied by other messages. Some example serie:

tcp: BAD CHECKSUM (8381) len 24
tcp: RETRANS limit, timeruse 6
retrans retry #1 rto 244

When running at higher baud rates (9600-19200), do you see any "serial: data overrun" messages?

No I don't see 'overrun' messages. Yet as I tried 4800, I could only go as far as this:

$ telnet 192.168.0.2
Trying 192.168.0.2...
Connected to 192.168.0.2.
Escape character is '^]'.
login: root
# ps
  PID   GRP  TTY USER STAT CSEG DSEG  HEAP   FREE   SIZE COMMAND
    1     0      root    S 3e12 3fbc  3072   5056  18608 /bin

... and it stuck like that for about an hour. ELKS didn't die, those messages mentioned above are still keep on appearing.

Also, do you know whether you are running with CONFIG_CSLIP enabled in .config?

CSLIP is enabled in ELKS. Also according to manual page, it's default for slattach and is changed by command-line option:

[-p proto]
              Set a specific kind of protocol to use on the line.  The default is set to cslip, i.e. compressed SLIP.  Other possible values are slip (normal  SLIP),  adaptive  (adaptive  CSLIP/SLIP),  ppp
              (Point-to-Point  Protocol)  and  kiss  (a protocol used for communicating with AX.25 packet radio terminal node controllers).  The special argument tty can be used to put the device back into
              normal serial operation.  Using 'ppp' mode is not normally useful as ppp requires an additional ppp daemon pppd to be active on the line. For kiss connections the kissattach program should be used.
ghaerr commented 4 years ago

Thanks for the detail!

Once ktcp shows one of the "tcp: RETRANS" or retrans retry messages, it's pretty much toast. These messages are meant to show the TCP packet retransmission algorithm (which does kind of work), but I've found that ktcp is in big trouble internally by the time these shows up.

The second issue, that of remote program output hanging in telnet, is very related - the two sides get desynchronized, display and processing stop, and the RETRANS code kicks in.

The root reason for this I've determined to be something hard to describe: basically, when a packet arrives, ktcp must pass that received data to the ELKS application which is executing a read. The problem seems to be that ELKS doesn't schedule the application to run as often as it needs to, and ktcp ultimately gets a packet and sees there is no more space in the application input buffer. It then drops that packet, and never recovers (that's the RETRANS limit, which is the # of packets waiting to be retransmitted). I'm contemplating a fix that delays ktcp sending an ACK until the application read buffer is larger.

Thus, even for 4800 baud with no data overrun (or, I'm sure, even with hardware handshaking) - ELKS still can't keep up, the way both ktcp and the kernel sockets interface is coded. I'm working on that, but don't yet fully understand it.

This can be temporarily fixed by inserting a very small window size in ktcp/tcp_output.c line 354: 'len = 255'. You might try that. This advertises a 255 byte window size to the sender, and ELKS can keep up with that small a packet size, even at the increased overhead of more packets, because the application can swallow the data given its scheduled time slice received.

Complicated!

CSLIP is enabled in ELKS. Also according to manual page, it's default for slattach

Thanks. I'll make sure that CSLIP is the default, and add an option in ktcp for regular SLIP at startup.

pawosm-arm commented 4 years ago

This can be temporarily fixed by inserting a very small window size in ktcp/tcp_output.c line 354: 'len = 255'. You might try that. This advertises a 255 byte window size to the sender, and ELKS can keep up with that small a packet size, even at the increased overhead of more packets, because the application can swallow the data given its scheduled time slice received.

I've tried this. Doesn't seem to make any difference...

Mellvik commented 4 years ago

This can be temporarily fixed by inserting a very small window size in ktcp/tcp_output.c line 354: 'len = 255'. You might try that. This advertises a 255 byte window size to the sender, and ELKS can keep up with that small a packet size, even at the increased overhead of more packets, because the application can swallow the data given its scheduled time slice received.

I've tried this. Doesn't seem to make any difference...

FWIW - that's my experience too.

I'm diving back into the testing tomorrow - and @ghaerr, I have a request: Any chance you have an outgoing ping program for elks in the works?

BTW - ICMP (IP) checksums are now OK. The BAD CHECKSUM error message popped up recently, and may well be part of the underlying memory corruption problem. I also suspect that may be the case with the RESTRANS limit error: When tracing outgoing telnet last week, I discovered that ELKS suddenly processed an ARP packet that was not on the wire (and no interrupt), then went into a loop. Also, like @ghaerr pointed out before, adding and removing debug options in config.h changes how kTCP works.

Thanks @ghaerr for the walkthrough of the possible scheduling problems. The more we understand, the more targeted tracing.

—Mellvik

ghaerr commented 4 years ago

Hello @Mellvik,

Any chance you have an outgoing ping program for elks in the works?

Ping is pretty easy to write, but as you may know there is no direct "sockets" way to program one. Since ktcp is not in the kernel, there's no easy way for a ping to talk to ktcp. I just thought of a way, using the same "faked" connection that netstat uses though. Its on my large list of ktcp enhancements.

BTW - ICMP (IP) checksums are now OK.

Good to know, thanks!

The BAD CHECKSUM error message popped up recently, and may well be part of the underlying memory corruption problem.

Possible, but I need to know whether it is intermittent, and which external system the packets are being sent from to debug that. I also just added IP checksum as well, so now both IP and TCP will be checked.

Also, like @ghaerr pointed out before, adding and removing debug options in config.h changes how kTCP works.

I've tracked down most of the config.h/debug option difference problems as having to do with the relative timing of the packets received, and ktcp/telnet/telnetd process scheduling - the timing issues are getting quite complicated.

I also suspect that may be the case with the RESTRANS limit error: When tracing outgoing telnet last week, I discovered that ELKS suddenly processed an ARP packet that was not on the wire (and no interrupt), then went into a loop.

Yes, certainly could be. I have found that the retransmit code doesn't seem to work correctly to recover a lost packet, so there could be even more to that as well.

There are currently four MAJOR bugs in ELKS networking that I'm aware of:

I have another version that allows all of the above to be shown, but little in the way of fixes so far. Its getting to be a major time sink trying to debug it without a more sophisticated debugger.

Mellvik commented 4 years ago

Thank you, @ghaerr, This is very useful indeed. As to the ping (outgoing) issue - it may in fact be a good idea to move some of the low level protocol stuff into the kernel - at some point. I'm approaching the problems from the outside so to speak, starting with icmp where I found a couple of nasty small bugs this afternoon. Summay tomorrow.

--M

 Hello @Mellvik,

Any chance you have an outgoing ping program for elks in the works?

Ping is pretty easy to write, but as you may know there is no direct "sockets" way to program one. Since ktcp is not in the kernel, there's no easy way for a ping to talk to ktcp. I just thought of a way, using the same "faked" connection that netstat uses though. Its on my large list of ktcp enhancements.

BTW - ICMP (IP) checksums are now OK.

Good to know, thanks!

The BAD CHECKSUM error message popped up recently, and may well be part of the underlying memory corruption problem.

Possible, but I need to know whether it is intermittent, and which external system the packets are being sent from to debug that. I also just added IP checksum as well, so now both IP and TCP will be checked.

Also, like @ghaerr pointed out before, adding and removing debug options in config.h changes how kTCP works.

I've tracked down most of the config.h/debug option difference problems as having to do with the relative timing of the packets received, and ktcp/telnet/telnetd process scheduling - the timing issues are getting quite complicated.

I also suspect that may be the case with the RESTRANS limit error: When tracing outgoing telnet last week, I discovered that ELKS suddenly processed an ARP packet that was not on the wire (and no interrupt), then went into a loop.

Yes, certainly could be. I have found that the retransmit code doesn't seem to work correctly to recover a lost packet, so there could be even more to that as well.

There are currently four MAJOR bugs in ELKS networking that I'm aware of:

The socket code implementing networkread sometimes returns (and copies) WAY larger than the size asked for. Telnet is asking for 1024 bytes into its 1k buffer, and receives 24k bytes - which trashes the program data area and crashes. This is a nasty kernel problem having to do with how ktcp communicates with the kernel for socket calls. I have it totally reproducing, but don't yet understand why it happens. One can only imagine how bad it is when read doesn't work! telnet sometimes hangs after/in connect, and busy loops in kernel inet_select(). Same kernel timing problem as above. I've got a fix to ^C out of telnet for the time being. The data segment of ktcp at location DS:0004 sometimes gets overwritten after a call to connect. This always trashes two bytes at DS:0004. This happens infrequently and I have test code to show it now. Very strange. ELKS networking does not observe a sending window size limit, and when packets come in to frequently, or always on localhost, ktcp's read buffer's get near overrun. This happens every time on localhost connections, as telnetd writes data too quickly and there's no kernel code to unschedule telnetd when the write should be delayed, or when the receive window is full on telnet. This causes ktcp read buffers to near overflow. If I change the code to drop the packet, instead of overrunning the buffer, ktcp loses sync and retransmit doesn't work. This inability to either work with dropped packets or keep up with too much received data is the original reason ktcp was running out of memory. I have another version that allows all of the above to be shown, but little in the way of fixes so far. Its getting to be a major time sink trying to debug it without a more sophisticated debugger.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

Mellvik commented 4 years ago

Finally - some real catches in this neck of the woods.

I don't know whether it qualifies as a race condition, but ARP and ICMP are stepping on each other in deveth.c.

The symptom that caught my attention was an occasional checksum error in the first PING reply from ELKS, as reported by tcpdump. Turned out it was completely repeatable:

If the Raspi ARP cache has timed out, so that it initiates ans ARP exchange before the first ping, everything is fine - there is no ARP intermingled with the ICMP.

So - this may be totally irrelevant for the big picture - who cares about an occasional ICMP checksum error? Then again, ARPs happen all the time and for all I know the bug may have other consequences.

@ghaerr, I'm not qualified to fix this - let me know if you need more details.

BTW - here's a 'packet trace' - just the 64 bytes ICMP echo reply packet (payload) as sent to the checksum routine, then what's actually ending up on the wire:

chk: 64 - 0000 0000 f654 0100 b6f6 5ecf 9af9 0005 0908 0b0a 0d0c 0f0e 1110 1312 1514 1716 1918 1b1a 1d1c 1f1 e 2120 2322 2524 2726 2928 2b2a 2d2c 2f2e 3130 3332 3534 3736

snd: [64] 00 12 34 56 0a 00 02 0f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7a b1 a3 4b 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 35 36 37

Two minor bugs - unrelated to the above:

In icmp.c

In vjhc.c

—Mellvik

ghaerr commented 4 years ago

I don't know whether it qualifies as a race condition, but ARP and ICMP are stepping on each other in deveth.c.

Congratulations @Mellvik, you have just identified another major corruption error!

This all relates back to ARP "wait" hack that was inserted about three years ago, which is the basis for unresolved issue https://github.com/jbruchon/elks/issues/67#issue-210345739. Thanks to your finding, I now understand how that commit is also overwriting received IP packets as well.

Here's what is happening: deveth_process is called from the ktcp main loop whenever /dev/eth (the ethernet card) has input. You'll see that deveth_process reads the packet into a static 'sbuf' and calls either ip_recvpacket or arp_recvpacket, depending on the type. In your case, its an ICMP packet, which ip_recvpacket gets, and sends to icmp_process. The ping is received, and a reply packet is prepared, and ip_sendpacket is called to send it. So far so good.

ip_sendpacket has the icmp echo reply, and needs an ethernet address, (the bug occurs only when ktcp is running on ethernet and doesn't have a cached ARP entry), so it calls arp_request. arp_request builds an ARP request, and here, (the ARP "wait" hack), it writes the ARP request on the wire, but then does a forced wait outside the main loop in select, for another packet to be recieved (which could be anything, the subject of issue #67). In order to process that received packet, which is usually but not always an ARP reply, arp_request calls deveth_process. Guess what deveth_process does? The same thing it did when called at the start of this discussion - it reads a packet into 'sbuf' - the very same buffer that ip_sendpacket and icmp_process are still in the process of using to get the ICMP reply out. So the received ARP reply stomps right on top of the IP packet waiting for an ARP reply.

A quick fix would be to keep separate buffers for received ARP vs IP packets - but that's another hack when what is needed is to remove the ARP wait hack in the first place, which also fixes #67. I need a bit of time to consider how to do this. It appears this major bug occurs anytime a received IP packet generates an IP packet to send, that needs an ARP request. This is generally the case for any received TCP packet for which an TCP ACK reply is sent, but I'm not seeing the bug that frequently, although I think QEMU does some automatic with ARPing and could be generating an ARP beforehand which fills the ktcp cache, thus sidestepping this buffer corruption!

For below, thanks for finding the length bug:

-       len = ntohs(iph->tot_len) - IP_HLEN(iph);

+       len = ntohs(iph->tot_len) - (int)4*IP_HLEN(iph);

Not sure about this one - did you test it:

In vjhc.c
-       ip_hdr->check= ip_calc_chksum((char *)ip_hdr, state->s_ip_hdr_len);
+       ip_hdr->check= ip_calc_chksum((char *)ip_hdr, (int) 4 * state->s_ip_hdr_len);

You'll notice prior to that in the same file:

ip_hdr_len= (ip_hdr->version_ihl & IH_IHL_MASK) * 4;

which is used to set state->s_ip_hdr_len a few lines later.

Mellvik commented 4 years ago

A quick fix would be to keep separate buffers for received ARP vs IP packets - but that's another hack when what is needed is to remove the ARP wait hack in the first place, which also fixes #67.

Yes, I actually did that - quick and dirty - while testing yesterday – and I agree, it's an ugly hack. It works but I ran into trouble (needed more logic) when things happen out of (expected) sequence - quite common when running on slow hardware (such as several pings arriving before the arp sequence has completed). In a real world situation there will be ARP traffic, possibly pings, in addition to regular traffic from multiple sources and this (my) hack certainly won't cut it. I need a bit of time to consider how to do this. It appears this major bug occurs anytime a received IP packet generates an IP packet to send, that needs an ARP request. This is generally the case for any received TCP packet for which an TCP ACK reply is sent, but I'm not seeing the bug that frequently, although I think QEMU does some automatic with ARPing and could be generating an ARP beforehand which fills the ktcp cache, thus sidestepping this buffer corruption!

I suspect QEMU acts 'normally'. The bug doesn't really affect operation for this particular case. It corrupts an ICMP ECHO reply, but you wouldn't notice w/o packet tracing. Ping (normally) doesn't. The reason I attacked this was the suspicion that it may have other consequences that may be even harder to track down.

My motivation for the sbuf hack was to eliminate this possibility while tracking down other bugs. It may or may not do that, I realize now. BTW, IMO it makes most sense to move ARP into the kernel. That will take some extra kernel bytes but keep the ARP traffic isolated and low latency.

Not sure about this one - did you test it:

In vjhc.c— No, I didn't. And you're right - this is not a bug, my mistake.

I'll keep my sbuf-hack for now while tracking down the ktcp-loops-after-24-pings problem. I find it interesting that 24 times 64 (the icmp echo request payload) adds up to 1538, the max ethernet packet payload. This cannot be incidental ...

--Mellvik

Mellvik commented 4 years ago

Short update on the ping-testing:

It seems I'm hitting a wall on this one, leaning towards the conclusion that the problem is related to the select/read issue mentioned by @ghaerr a few days back.

Here's what happens:

  int len = p_read (devfd, buf, MAX_PACKET_ETH);
  if (len == -1) {
        perror("deveth: Error in read ");
        return;
  }
  if (len < sizeof(eth_head_t)) { /* now nonfunctional */
        printf("DEVETH: Runt packet %d\n", len);
        return;
  }

I need some advice as to how to proceed with this. What am I missing?

--Mellvik

ghaerr commented 4 years ago

Hello @Mellvik!

It is amazing how many bugs are in the ktcp stack, for sure! Thanks for your continued testing, we'll get this figured out...

not detectable by the current code because if (len < sizeof(eth_head_t)) for some reason doesn't work

The reason for this is that sizeof is a size_t which is unsigned; in a C comparison, if either side is unsigned, an unsigned comparison is used, so -1 is higher, not lower. Using an (int) cast in front of sizeof(...) should fix that bug.

With regards to the 24 packets, it seems the problem is in the kernel side of the network stack, which we've only started to get into. The 512 return is EREST

ghaerr commented 4 years ago

Oops... With regards to the 24 packets, it seems the problem is in the kernel side of the network stack, which we've only started to get into. The 512 return is ERESTARTSYS, which has been used in various kernel networking code to indicate that the networking system is in an unstable state and should be rebooted.

I'm also noticing it in the main ethernet driver etc-main.c: the driver will return ERESTARTSYS if the low-level ethernet driver returns an error on packet read or write. I'm not yet familiar with ethernet hardware, but it seems to return that error if the "ring buffer" is empty after an interrupt, and also if the packet sized is 0 or > 1528.

ghaerr commented 4 years ago

You might stick a printk at line 66 of etc-main.c, when the low-level ethernet code returns nonzero, to see if this is the location of the ERESTARTSYS error you're getting. It is possible that the card needs to be reset for some reason after this.

On another note: if you look at line 124 of eth-main.c, you'll see issue #133 referenced, and a fix ensuring the packet size is a minimum 64. I'm now thinking that your earlier guess that 64 * 24 = 1536 is the culprit here: your card is doing something the interrupt routine can't handle after 24 64-byte packets have been received, and likely returning ERESTARTSYS as discussed above. It would be interesting to know whether a full-size packet being received would "discard" the apparent ring buffer of < 24 64-byte packets and reset its counter, so to speak.

You might look closely at the ne2k_int_stat routine in ne2k-mac.s, it looks like RX ring overflow isn't being handled. Also, as a temp measure, you could call ne2k_reset() in the eth-main.c driver instead of return ERESTARTSYS.

I'll have to read up on ethernet cards to be of more help.

Another thought would be to look at other OS ethernet drivers... does anyone know where the ELKS driver came from?

Mellvik commented 4 years ago

I'll be d..ned, I need to hit the wall more often. All this energy ... & new angles. Thanks.

You might stick a printk at line 66 of etc-main.c, when the low-level ethernet code returns nonzero, to see if this is the location of the ERESTARTSYS error you're getting. It is possible that the card needs to be reset for some reason after this.

Good idea! On another note: if you look at line 124 of eth-main.c, you'll see issue #133 referenced, and a fix ensuring the packet size is a minimum 64. I'm now thinking that your earlier guess that 64 * 24 = 1536 is the culprit here: your card is doing something the interrupt routine can't handle after 24 64-byte packets have been received, and likely returning ERESTARTSYS as discussed above. It would be interesting to know whether a full-size packet being received would "discard" the apparent ring buffer of < 24 64-byte packets and reset its counter, so to speak.

I'll put in a different card just to make sure. You might look closely at the ne2k_int_stat routine in ne2k-mac.s, it looks like RX ring overflow isn't being handled. Also, as a temp measure, you could call ne2k_reset() in the eth-main.c driver instead of return ERESTARTSYS.

Wouldn't that be noticeable in the other traffic types? Or maybe that's what it is - noticeable.. I'll have to read up on ethernet cards to be of more help.

Don't -there are too many other things that require intimate understanding of the internals of the system. The ethernet i/f is on the edge. And even though most of us don't move at your speed @ghaerr, i'm sure your process table is overloaded already. We'll figure the h/w stuff out - with some hints now and then. Another thought would be to look at other OS ethernet drivers... does anyone know where the ELKS driver came from?

I don't, but there are plenty of examples to compare to.

Just can't wait to get a tftp server stable... enabling completely remote debugging of phys hw.

More testing late next week.

--Mellvik

Mellvik commented 4 years ago

ARP botch:

Elks will answer any and all arp reqests. Neither arp.c nor deveth.c checks that the request is for its ip address.

-- M

  1. mai 2020 kl. 19:20 skrev Gregory Haerr notifications@github.com:

 You might stick a printk at line 66 of etc-main.c, when the low-level ethernet code returns nonzero, to see if this is the location of the ERESTARTSYS error you're getting. It is possible that the card needs to be reset for some reason after this.

On another note: if you look at line 124 of eth-main.c, you'll see issue #133 referenced, and a fix ensuring the packet size is a minimum 64. I'm now thinking that your earlier guess that 64 * 24 = 1536 is the culprit here: your card is doing something the interrupt routine can't handle after 24 64-byte packets have been received, and likely returning ERESTARTSYS as discussed above. It would be interesting to know whether a full-size packet being received would "discard" the apparent ring buffer of < 24 64-byte packets and reset its counter, so to speak.

You might look closely at the ne2k_int_stat routine in ne2k-mac.s, it looks like RX ring overflow isn't being handled. Also, as a temp measure, you could call ne2k_reset() in the eth-main.c driver instead of return ERESTARTSYS.

I'll have to read up on ethernet cards to be of more help.

Another thought would be to look at other OS ethernet drivers... does anyone know where the ELKS driver came from?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

ghaerr commented 4 years ago

Elks will answer any and all arp reqests. Neither arp.c nor deveth.c checks that the request is for its ip address.

It looks like deveth.c line 94 checks for two MAC addresses: broadcast and ELKS. I'm thinking that ARP requests with a broadcast header (all 0xFF) are coming in through there, into arp_recvpacket, adding the request to the cache (which may not make sense if the request is a query, not a response), then issuing an ARP reply regardless (line 199 in arp.c).

I can't exactly debug this in QEMU, is ELKS responding to all ARP requests with its own MAC/ethernet address? I think we can fix this by comparing the requested src IP in arp.c line 198 and not call arp_reply if its local_ip. Perhaps test this by adding the line if (arp->ip>src == local_ip) before arp_reply on line 199.

I'll read up a bit on ARP headers to see whether the call to arp_cache_add is appropriate. Note that ktcp doesn't timeout ARP entries either, so there's still work to be done.

Mellvik commented 4 years ago

Elks will answer any and all arp reqests. Neither arp.c nor deveth.c checks that the request is for its ip address.

It looks like deveth.c line 94 checks for two MAC addresses: broadcast and ELKS. I'm thinking that ARP requests with a broadcast header (all 0xFF) are coming in through there, into arp_recvpacket, adding the request to the cache (which may not make sense if the request is a query, not a response), then issuing an ARP reply regardless (line 199 in arp.c).

Yes, this is what I'm reading too. This is not per the recommendation in RFC 826. It may in fact be argued that with such a small ARP cache, this behaviour is bad news for ELKS on a populated network. More on that below.

While reading the ELKS ARP code I believe I'm seeing a related but different problem. arp_recvpacket calls arp_cache_add before doing any processing (other than checking the ether addresses), which is per the RFC guidelines. arp_cache_add calls arp_cache_get to check if the IP address is already in the cache. Still correct. THEN arp_cache_get OVERWRITES the ether-address from the call with the ether address in the cache, thus not only passing the opportunity to correct the entry if it's wrong, but also preserving incorrect info in the cache. This is not per the RFC guide for obvious reasons. @ghaerr, are you seeing the same?

I can't exactly debug this in QEMU, is ELKS responding to all ARP requests with its own MAC/ethernet address?

Yes. I think we can fix this by comparing the requested src IP in arp.c line 198 and not call arp_reply if its local_ip. Perhaps test this by adding the line if (arp->ip>src == local_ip) before arp_reply on line 199.

Yes, this is per the RFC recommendation - which essentially says 1) check the cache, update cached ether_addr if IP_addr found, 2) check if it's we are the destination (our IP in the request), 3) add to cache if not found in step 1, 4) send arp-reply:

"" [optionally check the protocol length ar$pln] Merge_flag := false If the pair <protocol type, sender protocol address> is already in my translation table, update the sender hardware address field of the entry with the new information in the packet and set Merge_flag to true. ?Am I the target protocol address? Yes: If Merge_flag is false, add the triplet <protocol type, sender protocol address, sender hardware address> to the translation table. ""

I'll read up a bit on ARP headers to see whether the call to arp_cache_add is appropriate. Note that ktcp doesn't timeout ARP entries either, so there's still work to be done.

Yes, I realize that - there is a lot of ARP stuff ELKS doesn't do, which is fine for now. I think correcting the behaviour above and adding an arp command that can flush, list, maybe manually populate the cache, will reduce the need for a timed cache to almost zero.

My ktcp branch is too messy to lend itself to a PR for this right now, but I can send a diff when it's tested.

—Mellvik

Mellvik commented 4 years ago

Diff for arp.c fixes, re. message above:

 typedef struct arp_cache_s arp_cache_t;

 static arp_cache_t arp_cache [ARP_CACHE_MAX];
+static int merge_flag = 0;     /* tell arp_cache_get to merge the incoming address */

 static void arp_cache_init (void)
@@ -57,8 +58,14 @@ int arp_cache_get (ipaddr_t ip_addr, eth_addr_t * eth_addr)
                if (!entry->ip_addr)
                    break;
                if (entry->ip_addr == ip_addr) {
-                       memcpy (eth_addr, entry->eth_addr, sizeof (eth_addr_t));
-                       debug_arp("arp: using cached entry for %s\n", in_ntoa(ip_addr));
+                       if (merge_flag) {
+                               memcpy (entry->eth_addr, eth_addr, sizeof (eth_addr_t));
+                               merge_flag = 0;
+                               debug_arp("arp: updating cached entry for %s\n", in_ntoa(ip_addr));
+                       } else {
+                               memcpy (eth_addr, entry->eth_addr, sizeof (eth_addr_t));
+                               debug_arp("arp: using cached entry for %s\n", in_ntoa(ip_addr));
+                       }
                        return 0;
                }
                entry++;

@@ -190,13 +199,19 @@ int arp_request(ipaddr_t ipaddress)
 void arp_recvpacket(unsigned char *packet, int size)
 {
        register struct arp *arp = (struct arp *)packet;
+       int notfound;

        arp_print(arp);
        switch (ntohs(arp->op)) {
        case ARP_REQUEST:
                debug_arp("arp: incoming REQUEST\n");
-               arp_cache_add (arp->ip_src, &arp->eth_src);
-               arp_reply (packet, size);
+               merge_flag++;
+               notfound = arp_cache_get(arp->ip_src, &arp->eth_src); /* possible cache update */
+               if (arp->ip_dst == local_ip) {
+                       if (notfound)
+                               arp_cache_add(arp->ip_src, &arp->eth_src); 
+                       arp_reply (packet, size);
+               }
                break;

        case ARP_REPLY:
ghaerr commented 4 years ago

Hello @Mellvik,

Thanks for the ARP fix!

Is this version RFC 826 compliant, according to your reading of it? I read most of the RFC yesterday.

This fixes the ARP replies being generated by ELKS when it shouldn't, does it fix anything else that can be noted?

I don't have a way to fully test it, but will check what I can on QEMU. I would like to get rid of the merge_flag global and will submit a slightly revised PR with that passed as a parameter to arp_cache_get() for you to test and we'll commit it.

I've been trying to get a PC that will run serial port or ethernet and have struck out twice in the last week. The first system was too modern for an ISA NE2K, but wouldn't boot an ELKS FAT32 MBR USB stick for some reason. The second system was quite ancient but its BIOS didn't support any USB boot. I can't write floppies so USB boot seems to be the way to go. Can't wait to actually get some real hardware that can run ELKS lol :)

Mellvik commented 4 years ago

Is this version RFC 826 compliant, according to your reading of it? I read most of the RFC yesterday.

Cache updates are Now per the RFC recommendations. This fixes the ARP replies being generated by ELKS when it shouldn't, does it fix anything else that can be noted?

The bug pointed out in my previous message is fixed, in addition to the 'automatic' cache update. I don't have a way to fully test it, but will check what I can on QEMU. I would like to get rid of the merge_flag global and will submit a slightly revised PR with that passed as a parameter to arp_cache_get() for you to test and we'll commit it.

I expected that :-) so I took the short route. I've been trying to get a PC that will run serial port or ethernet and have struck out twice in the last week. The first system was too modern for an ISA NE2K, but wouldn't boot an ELKS FAT32 MBR USB stick for some reason. The second system was quite ancient but its BIOS didn't support any USB boot. I can't write floppies so USB boot seems to be the way to go. Can't wait to actually get some real hardware that can run ELKS lol :)

Good luck with that. May I recommend compaq portables on ebay? I've sold 4 so far this year. They're great fun to restore... just got a broken compaq slt386s/20 from australia. Looks great, unfortunately, xircom parallel (and serial) is the only way to network it. Runs win 3.1 :-)

M

.

BTW the ping 24 problem seems to be in the driver - I'm having fun with 86 assembly code right now.

ghaerr commented 4 years ago

May I recommend compaq portables on ebay?

That's a neat idea, I used to have one years ago that I developed on... but I need a system that will boot an ELKS MBR USB, since I have no way of writing a floppy!

Mellvik commented 4 years ago

May I recommend compaq portables on ebay?

That's a neat idea, I used to have one years ago that I developed on... but I need a system that will boot an ELKS MBR USB, since I have no way of writing a floppy!

A standard 1.44 usb floppy works fine on your mac (last time I checked, I'm using the raspi these days - from the mac) ... and costs little..

M

ghaerr commented 4 years ago

@Mellvik,

Thanks for you suggestions on hardware. I bought a Compaq Portable 386 on ebay, along with a Gotek floppy drive emulator. I hope to use this to improve the serial port driver for higher speeds, and, with the single ISA expansion bay, get an NE2K network card up and running for network testing, and ultimately the implementation of TFTP for updating ELKS on its internal hard drive.

Mellvik commented 4 years ago

Congratulations!! Presumably with an expansion box so you get the ISA bus. I used a floppy emulator like the gotek for a while, works fine. If the machine has the original (slow) disk, you may consider replacing it with a CF card IDE socket. Too bad I couldn't sell you one - I'm preparing two 386/20s for ebay right now. Let me know if there is anything you need. BTW - I just noticed that the 3COM 503 card uses the same chipset as the ne2k, so expanding ELKS support to that 3com model would be easy.

ghaerr commented 4 years ago

Presumably with an expansion box so you get the ISA bus

Yes.

I used a floppy emulator like the gotek for a while, works fine.

I've heard that the system has to be booted from 5" floppy first and some sort of setup program run... The Compaq BIOS expects a 1.2M 5.25" drive, will there be a problem if I try running the Gotek as a 1.44M 3.5" drive when attached? I guess I'll have to open the case to even get that working.

the 3COM 503 card uses the same chipset as the ne2k, so expanding ELKS support to that 3com model would be easy.

Just looked on eBay, and the 503 cards have the older BNC connector... what about this one, a 3C509 which has the ethernet connector, do you think that would work?

https://www.ebay.com/itm/Vintage-PC-Hardware-3Com-ISA-Etherlink-III-3c509TP-Ethernet-card/124188175361?_trkparms=aid%3D1110001%26algo%3DSPLICE.SIM%26ao%3D1%26asc%3D225085%26meid%3Daaf9c71aaad94bc5abe27d1fa1c8fcd7%26pid%3D100005%26rk%3D2%26rkt%3D12%26mehot%3Dpf%26sd%3D274236107437%26itm%3D124188175361%26pmt%3D0%26noa%3D1%26pg%3D2047675&_trksid=p2047675.c100005.m1851

Mellvik commented 4 years ago

 Presumably with an expansion box so you get the ISA bus

Yes.

I used a floppy emulator like the gotek for a while, works fine.

I've heard that the system has to be booted from 5" floppy first and some sort of setup program run... The Compaq BIOS expects a 1.2M 5.25" drive, will there be a problem if I try running the Gotek as a 1.44M 3.5" drive when attached? I guess I'll have to open the case to even get that working.

No, booting from a 1,44MB drive from scratch works. If you're getting a machine with a broken battery - most likely - You have to open it anyway. I can send you pictures on how to fix that. Minimal soldering, a cr2032 socket, cr2032 battery and some double sided tape. the 3COM 503 card uses the same chipset as the ne2k, so expanding ELKS support to that 3com model would be easy.

Just looked on eBay, and the 503 cards have the older BNC connector... what about this one, a 3C509 which has the ethernet connector, do you think that would work?

No, go for a real ne2k card, that's easier. A real old one with jumpers instead of programmed config (IRQ, ioaddr ) is an unbelievable simplification. You won't believe the time I spent to find a DOS utility that actually worked with my 'other' ne2k card - to change the IRQ to 9 - some don't even allow that ... or something else.

I'll have a look on ebay. May make sense to take this thread off of github. helge@mymayday.com is good. M

Mellvik commented 4 years ago

Update on ktcp status:

I’m (partly) rewriting the low level of the ne2k driver, now being tested - incoming ping works, arp works, incoming telnet is almost stable.

While finishing up, here’s a (probably minor) challenge for you @ghaerr:

The case is incoming telnet to elks. If the client sends a RESET, tcp.c line 239 handles that fine, cleans up the receiving end. But if there are outgoing packets pending, ktcp panics in tcpdev.c line 310. Seems a send-cleanup is needed too ... here’s the debug output (w/my extensions)

IP: recv tcp packet retrans check buffers 3, mem 255 retrans retry #1 rto 2048 arp: using cached entry for 10.0.2.2 IP: send 10.0.2.15 -> 10.0.2.2 v4 hl:5 tos:0 len:105 id:182 fo:0 ihl:5 chk:c961 (OK) ttl:64 prot:6 |t||r|/C53|B4d/ |RD: 1536 64 94da|IP: recv 10.0.2.2 -> 10.0.2.15 v4 hl:5 tos:0 len:40 id:0 fo:4000 ihl:5 chk:c022 (OK) ttl:64 prot:6 IP: recv tcp packet tcp: RST received, removing retrans packets retrans free buffers 2 mem 170 retrans free buffers 1 mem 85 retrans free buffers 0 mem 0 Free CB ktcp: panic in write

—Mellvik

ghaerr commented 4 years ago

Thanks for the update!

I’m (partly) rewriting the low level of the ne2k driver, now being tested - incoming ping works, arp works, incoming telnet is almost stable.

Nice - did you find particular areas of the NE2K driver that were broken? What were they? BTW, I ordered the network card you recommended for the Compaq 386.

The case is incoming telnet to elks. If the client sends a RESET, tcp.c line 239 handles that fine, cleans up the receiving end. But if there are outgoing packets pending, ktcp panics in tcpdev.c line 310.

Unfortunately I haven't been able to determine the problem with this yet. ktcp has a number of problems on session termination, whether from RST, or from normal or abnormal telnet termination. These are complicated with the kernel/tcpdev design which passes socket information to/from user programs and ktcp. I have been testing using localhost, which only works for a while before hanging. None of the session disconnects work properly and there also seems to be memory corruption errors afterwards as well.