rcornwell / sims

Burroughs B5500, ICL1900, SEL32, IBM 360/370, IBM 7000 and DEC PDP10 KA10/KI10/KL10/KS10, PDP6 simulators for SimH
http://sky-visions.com

IMP DHCP is broken and does not follow the details spelled out in RFC2131 (Dynamic Host Configuration Protocol) #152

Closed markpizz closed 5 years ago

markpizz commented 5 years ago

This was determined by the failure of a DHCP configured system to acquire an IP address. That failure required WireShark examination of the protocol activities which then required review of RFC2131 to compare what was seen to what should have been seen.

What was seen:

1) The first packet sent by the IMP DHCP client is a "DHCP Discover" to the LAN broadcast address. So far so good, maybe. The details in the "DHCP Discover" packet may or may not be correct.
2) 210ms later (before anything on the LAN gets a chance to respond), the IMP DHCP client sends a "DHCP Request" packet. This packet is absolutely not proper according to the protocol. It should only be sent in response to a DHCP Offer packet from the DHCP Server. The RFC says that "DHCP Discover" should be resent after some 4 seconds (+/-1), with the retry delay doubling each successive time up to a maximum of 64 seconds. Note: On some networks (with switches in between the client and the available DHCP Server) the network may take some time to recognize the connectivity of the client system's sending MAC address before responses can land at the right place, so some retries for some clients may be completely normal.
3) 16+ seconds later another "DHCP Discover" packet is sent, followed 20ms later by a "DHCP Request" packet. Approximately 16 seconds later another "DHCP Request" packet is sent, which then repeats indefinitely (only "DHCP Request" packets) every 16 seconds.
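For reference, the retransmission schedule that RFC 2131 describes can be sketched roughly as follows. This is only an illustration, not code from kx10_imp.c: the function names are made up, and the jitter is simplified to whole seconds.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the IMP source).
 * Base delay in seconds before retransmission number `attempt`
 * (0 = first retransmission): 4 s, doubling each time, capped at 64 s. */
static int dhcp_base_delay(int attempt)
{
    int delay = 4;

    assert(attempt >= 0);
    while (attempt-- > 0 && delay < 64)
        delay *= 2;
    return delay;
}

/* Base delay plus a jitter of -1, 0 or +1 second.  The RFC asks for a
 * uniformly distributed value between -1 and +1; whole seconds are a
 * simplification made for this sketch. */
static int dhcp_retry_delay(int attempt)
{
    return dhcp_base_delay(attempt) + (rand() % 3) - 1;
}
```

With this schedule the base delays are 4, 8, 16, 32 and then 64 seconds for every later retry, so nothing should go out 210ms after the first Discover.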

It is somewhat surprising that no response came back from the DHCP server. That may be due to one of two reasons:

1) The outgoing "DHCP Discover" was not properly formed, OR
2) The protocol-violating immediate sending of a "DHCP Request" packet may have quieted the DHCP server.

A review of the contents of the "DHCP Discover" packet shows one difference between the IMP generated packet and one from a working node on the LAN: the DHCP_OPTION_CLIENT_ID option was not present. It was added to the option list. This didn't change anything, since the "DHCP Request" packet was still being sent some 200ms after the "DHCP Discover".

While looking for the logic that causes the "DHCP Request" to be sent, I encountered some strange scheduling activities. It seems that imp_unit[1] is dedicated to IMP timing activities, and it presumes that sim_clock_coschedule() with an interval value of 1000 will somehow be useful to time against wall clock time and provide 1000 calls per second. The interval argument to sim_clock_coschedule() is in units of instructions and not anything related to wall clock. This is also quite a frequent calling expectation.

While walking through things I notice that sim_interval is being referenced as if it were the number of instructions that have been executed since the beginning of execution:

case CONI:
    switch (GET_DTYPE(uptr->flags)) {
    case TYPE_MIT:
         last_coni = sim_interval;

AND

    /* Only if there has been a CONI lately. */
    if (last_coni - sim_interval < CONI_TIMEOUT)

The correct reference should be sim_gtime(). sim_interval has no meaning within any device implementation code.
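To illustrate why a countdown value cannot stand in for elapsed time, here is a small self-contained model. The names mock_interval, mock_gtime and mock_run are hypothetical stand-ins, not SimH APIs; they only mimic a countdown to the next event versus a running total of executed instructions.

```c
#include <assert.h>

/* Stand-ins for the simulator's bookkeeping (hypothetical names). */
static int    mock_interval = 0;    /* countdown to the next event      */
static double mock_gtime    = 0.0;  /* total instructions executed      */

/* Execute n instructions.  The countdown is reloaded to `reload`
 * whenever it reaches zero, as happens when an event fires. */
static void mock_run(int n, int reload)
{
    assert(reload > 0);
    while (n-- > 0) {
        if (mock_interval <= 0)
            mock_interval = reload;
        mock_interval--;
        mock_gtime += 1.0;
    }
}
```

In this model, the difference between two samples of mock_gtime is always the number of instructions executed in between, while the difference between two samples of mock_interval depends on how many reloads happened along the way.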

The doc for the PDP10-KA describes the IMP, but doesn't mention MPX. This has been clarified in the updated ka10_doc.doc file.

Looking through the MTAB entries there exists a SET IMP DHCPIP=a.b.c.d which doesn't actually have a parser for the IP address and manually setting the DHCP server would be meaningless. This has been removed. Additionally, displaying the DHCP server when DHCP is not enabled is meaningless, so the output of SHOW IMP DHCPIP has been changed to say that "DHCP disabled" when it is.

It is not clear why eth_filter() was called specifying that all multicast packets should be received. Maybe so that explicit broadcast packets would be captured, since implicitly they include the broadcast address. With all multicast packets enabled, specific care must be taken to not misinterpret these packets. It is better to explicitly list the broadcast address as one of the potential addresses accepted in the eth_filter() API. I disabled all multicast and we'll look at the broadcast reception issue later. Disabling all multicast avoided some confusion interpreting some chatty traffic on the LAN, and the quick sending of the DHCP REQUEST packets ceased.

Once the misinterpreted traffic was avoided, the imp_dhcp_timer() routine was observed looking at values which hadn't been set yet, since they are only set as a consequence of a successful DHCP negotiation (lease_time, etc). This is avoided by only examining those values when the dhcp_state is set to BOUND.

Tracing traffic on the wire, it seems that the DHCP server is explicitly sending the DHCP Offer packet to the broadcast MAC address. As a consequence, the call to eth_filter() needed to be adjusted to specify both the IMP device MAC address and the broadcast address.
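The acceptance rule that results (own MAC or broadcast, nothing else) can be sketched like this. It is a stand-alone illustration with a made-up function name, not the eth_filter() API itself.

```c
#include <assert.h>
#include <string.h>

static const unsigned char BROADCAST_MAC[6] =
    { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF };

/* Hypothetical helper: return 1 when a frame whose destination MAC is
 * `dst` should be delivered to a device whose own MAC is `own`.
 * Multicast destinations are deliberately not accepted. */
static int frame_wanted(const unsigned char *dst, const unsigned char *own)
{
    assert(dst != NULL && own != NULL);
    return memcmp(dst, own, 6) == 0 ||
           memcmp(dst, BROADCAST_MAC, 6) == 0;
}
```

A frame addressed to a multicast group such as 01:00:5E:00:00:FB is rejected by this rule, while a DHCP Offer sent to FF:FF:FF:FF:FF:FF is accepted.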

The protocol now moved along better, and it generated a DHCP Request packet. However, this packet was malformed. There were multiple subtle problems with both setting and interpreting data in the DHCP packets. These were all related to inconsistently referencing or providing values in network or host byte order. All IP addresses and masks should always be represented, presented and compared in network byte order. Fixing this lets DHCP work cleanly.
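As a minimal sketch of the "everything stays in network byte order" rule, assuming a POSIX host for htonl(): the helper names below are invented for the example and are not taken from the IMP source.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htonl(); assumes a POSIX host */

/* All three arguments are in network byte order.  Masking and equality
 * work on the raw bits, so no conversion is needed for the comparison. */
static int same_subnet(uint32_t a, uint32_t b, uint32_t mask)
{
    return (a & mask) == (b & mask);
}

/* Hypothetical helper: build a network-byte-order address from the four
 * parts of a dotted quad. */
static uint32_t ip_nbo(unsigned a, unsigned b, unsigned c, unsigned d)
{
    assert(a < 256 && b < 256 && c < 256 && d < 256);
    return htonl(((uint32_t)a << 24) | ((uint32_t)b << 16) |
                 ((uint32_t)c << 8)  |  (uint32_t)d);
}
```

Using the addresses from the SHOW IMP examples further down, 192.168.60.40 and 192.168.60.6 fall in the same /24 while 10.3.0.6 does not. The point is that the check gives the same answer on big- and little-endian hosts as long as every operand is kept in the same (network) order.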

I created a separate unit against which to schedule the wall clock activities. These are only the DHCP retry activities, lease expiration and ARP aging. This activity has been removed from the unit 1 IMP packet timer.
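A once-per-second timer of this kind, including the "only look at the lease while BOUND" guard mentioned earlier, might look roughly like the sketch below. The state names, the struct and the fixed 4 second retry are assumptions made for the example; real DHCP also has renewing and rebinding phases that are left out here.

```c
#include <assert.h>
#include <stddef.h>

enum dhcp_state  { DHCP_OFF, DHCP_SELECTING, DHCP_REQUESTING, DHCP_BOUND };
enum dhcp_action { ACT_NONE, ACT_RESEND, ACT_LEASE_EXPIRED };

struct dhcp_client {
    enum dhcp_state state;
    int lease_left;   /* seconds; only meaningful while BOUND       */
    int retry_left;   /* seconds until the next retransmission      */
};

/* Hypothetical handler, called once per wall-clock second. */
static enum dhcp_action dhcp_tick(struct dhcp_client *c)
{
    assert(c != NULL);
    switch (c->state) {
    case DHCP_BOUND:
        /* Lease values are valid only after a successful negotiation. */
        if (--c->lease_left <= 0) {
            c->state = DHCP_SELECTING;   /* simplified: start over */
            c->retry_left = 0;
            return ACT_LEASE_EXPIRED;
        }
        return ACT_NONE;
    case DHCP_SELECTING:
    case DHCP_REQUESTING:
        /* Never consult lease_left here; it has not been set yet. */
        if (--c->retry_left <= 0) {
            c->retry_left = 4;           /* fixed retry, for brevity */
            return ACT_RESEND;
        }
        return ACT_NONE;
    default:
        return ACT_NONE;
    }
}
```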

There was a problem with outbound connections when a NAT: attach was made. In this situation, the IMP traffic was on a simulated LAN within the sim_ether layer and traffic to nodes on the host system's LAN or beyond all had to route through the sim_ether's NAT provided gateway. The outbound IP packet flow would notice that this outbound traffic was not on the local LAN and therefore needed an ARP entry for the gateway. An ARP request was made for the gateway while the original outgoing IP packet was queued for later delivery once an ARP entry was available. Since the actual ARP request that was issued was specifically for the gateway IP address, no ARP replies would be coming in for the original packet's IP address. This problem would be concealed if inbound traffic from the target IP happened first since that arriving traffic would create the needed ARP entry. The solution to this problem was to generate a static ARP entry for the LAN gateway at attach time. This then allows the outgoing traffic to off LAN destinations to avoid sending an ARP since it has already determined that the packet must transit through the gateway and since the gateway ARP is persistently available, things now work.
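The idea behind the fix can be sketched as follows: ARP is resolved for the next hop rather than for the final destination, and the gateway entry is static so aging never removes it. The table layout and function names are invented for this illustration, and all addresses are assumed to be held in one consistent byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

struct arp_entry {
    uint32_t      ip;
    unsigned char mac[6];
    int           ttl;        /* seconds left for dynamic entries */
    int           is_static;  /* 1: never aged out                */
    int           in_use;
};

/* The address ARP must resolve: the destination itself when it is on
 * the local LAN, otherwise the gateway. */
static uint32_t next_hop(uint32_t dst, uint32_t my_ip,
                         uint32_t mask, uint32_t gw)
{
    return ((dst & mask) == (my_ip & mask)) ? dst : gw;
}

static const struct arp_entry *arp_lookup(const struct arp_entry *tbl,
                                          size_t n, uint32_t ip)
{
    size_t i;

    assert(tbl != NULL);
    for (i = 0; i < n; i++)
        if (tbl[i].in_use && tbl[i].ip == ip)
            return &tbl[i];
    return NULL;
}

/* One second of aging; static entries are left alone. */
static void arp_age(struct arp_entry *tbl, size_t n)
{
    size_t i;

    assert(tbl != NULL);
    for (i = 0; i < n; i++)
        if (tbl[i].in_use && !tbl[i].is_static && --tbl[i].ttl <= 0)
            tbl[i].in_use = 0;
}
```

With a static entry for the gateway installed at attach time, a lookup on next_hop() for an off-LAN destination succeeds immediately, whereas a lookup on the destination address itself would never be satisfied by an ARP reply.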

Along the way to addressing the above problems, several latent problems were found in the sim_ether NAT code. These were not related to the ARP problem, but were merely race conditions at NAT shutdown. The race conditions appeared now since the IMP device's DHCP client sends a DHCP RELEASE during the IMP detach. All prior sim_ether use cases didn't send traffic while closing, so these hadn't been found previously. Additionally, a problem with the parsing of illegal NAT arguments was also identified and fixed.

FYI, DHCP is enabled by default since the emulated IMP reflects normal behavior on a modern system. Specifically, setting the IMP IP address or Gateway implicitly disables DHCP. A SHOW IMP displays useful information about what is configured and/or happening. A SHOW IMP ARP displays the current ARP information contained in the IMP device.

These are the cases which work now with the IMP device:

1) Direct connection to the host system LAN interface:

    sim> set IMP enabled
    sim> set IMP mpx=4
    sim> set IMP host=10.3.0.6
    sim> attach IMP eth0

  This allows the simulated system (configured with IP address 
  10.3.0.6) to reach systems on the LAN and across the Internet.  
  The LAN IP address is acquired via DHCP.  The IMP internal
  NAT will translate addresses from the 10.3.0.6 to the LAN's 
  network based on DHCP acquired network parameters. On Windows 
  hosts it also allows traffic to/from the host system.

  After the simulator is running, something like the following 
  will be the output of a SHOW IMP command:

    sim> SHOW IMP     
    IMP     MAC=00:00:02:79:B8:A1, MPX=4, IP=192.168.60.40/24
            GW=192.168.60.6, HOST=10.3.0.6, DHCP Server IP=192.168.60.10, Lease Expires in 691159 seconds
            attached to eth0, DHCP, MIT

2) Direct connection to the host system LAN interface without any NAT, presuming that the host system's network is 10.3.0.0/24 and the gateway is 10.3.0.1:

    sim> set IMP enabled
    sim> set IMP mpx=4
    sim> set IMP GW=10.3.0.1
    sim> set IMP IP=10.3.0.6/24
    sim> attach IMP eth0

  This allows the simulated system (configured with IP address 
  10.3.0.6) to reach systems on the LAN and across the Internet.  
  On Windows hosts it also allows traffic to/from the host system.

  After the simulator is running, something like the following 
  will be the output of a SHOW IMP command:

    sim> SHOW IMP     
    IMP     MAC=00:00:02:79:B8:A1, MPX=4, IP=10.3.0.6/24
            GW=10.3.0.1, HOST=0.0.0.0, DHCP disabled
            attached to eth0, MIT

3) Connectivity to the host system via NAT(slirp):

    sim> set IMP enabled
    sim> set IMP mpx=4
    sim> set IMP host=10.3.0.6
    sim> attach IMP nat:tcp=2323:10.0.2.15:23,tcp=9595:10.0.2.15:95

  This allows outbound traffic from the simulated system (configured
  with IP address 10.3.0.6) to reach the LAN, the Internet and the 
  host system.  The sim_ether NAT layer provides a DHCP server which 
  will assign the address 10.0.2.15 to the simulated system and NAT 
  translation will be performed by the sim_ether layer AND by the 
  internal NAT in the IMP simulator.  The host (and external systems) 
  can telnet into the simulator by telnetting to the host system's 
  IP address on port 2323.

  After the simulator is running, something like the following 
  will be the output of a SHOW IMP command:

    sim> SHOW IMP     
    IMP     MAC=00:00:02:79:B8:A1, MPX=4, IP=10.0.2.15/24
            GW=10.0.2.2, HOST=10.3.0.6, DHCP Server IP=10.0.2.2, Lease Expires in 86388 seconds
            attached to nat:tcp=2323:10.0.2.15:23,tcp=9595:10.0.2.15:95, DHCP, MIT

4) Connectivity to the host system via NAT(slirp) using a single layer of NATing:

    sim> set IMP enabled
    sim> set IMP mpx=4
    sim> set IMP gw=10.3.0.1
    sim> set IMP IP=10.3.0.6/24
    sim> attach IMP nat:nodhcp,gateway=10.3.0.1/24,tcp=2323:10.3.0.6:23,tcp=9595:10.3.0.6:95

  This allows outbound traffic from the simulated system (configured
  with IP address 10.3.0.6) to reach the LAN, the Internet and the 
  host system.  The sim_ether NAT layer will provide the only 
  NAT translation.  The host (and external systems) can telnet into
  the simulator by telnetting to the host system's IP address on 
  port 2323.

Connectivity beyond the above examples is also possible (using VDE or TAP, etc.). These methods often require additional host specific network setup to talk to other systems. VDE can be set up in a way that allows the simulator process to not run as root.

larsbrinkhoff commented 5 years ago

Thank you, Mark!

CC @eswenson1, these are fixes for IMP using TAP.

markpizz commented 5 years ago

@eswenson1 was the source of the test configurations that I used here. His stuff let me focus on what I needed to do to find the issues.

They won't hurt a TAP based setup, but they are in no way specific to TAP.

They will let a host system have the IMP connected directly to the LAN, and if that system happens to have a dedicated interface for this role and a separate interface for normal host system communication to the LAN, then everyone will be happy. Sure, you might say host systems with multiple interfaces would be rare, and that would be true for physical hosts, but it is easy to arrange when the host happens to be running in a VM. It also allows Windows based hosts to directly connect to the LAN and to talk to the host and everything else beyond. Host to simulator communications can be easily achieved with a NAT connection, but reaching the simulator behind the IMP would then use non-standard port numbers.

Host to simulator setups can also be achieved with VDE.

At the end of the day, these fixes bring the IMP device up to being a complete user of sim_ether's LAN connectivity (with its features and foibles).

rcornwell commented 5 years ago

Thanks, will add these in tonight and run tests to make sure that they work. Only issue: tags for KA10 should be KA10, not PDP10-Kx.

If all is well I will move stuff over to the main tree later this week. I am almost at the point of merging KL10 into the master tree, which will hold off some updates.

markpizz commented 5 years ago

You may want to fold each of the kx10_imp.c changes into a single commit. I started with separate commits, initially thinking that the ultimate change set wouldn't be that large, but as I dug into the DHCP behaviors it got large, and along the way the SCP changes were needed.

I recommend picking up the new SCP stuff (including the slirp directories) first and then adding the IMP pieces.

rcornwell commented 5 years ago

I am current with SCP since Saturday. Should I push the SCP changes back to the master with you as author?

markpizz commented 5 years ago

Please check for latest SCP. I don't recall when I pushed the most recent stuff.

Otherwise, the SCP changes are already in simh/simh.

rcornwell commented 5 years ago

I did some testing of this tonight and it would not get a DHCP address. I will try and check out what packets are being sent tomorrow.

eswenson1 commented 5 years ago

I couldn’t get this version to work either. Could never attach IMP to eth0. A “show eth” showed that eth0 was associated (in SIMH) with tap0. I had no tap0 NIC.

markpizz commented 5 years ago

If SHOW ETH doesn't display your physical interfaces, then it is likely you're not running as root, or libpcap-dev isn't installed. If you're running as root, what was output when you built the simulator with make?

Directly injecting and promiscuous mode listening on the physical Ethernet interface is a privileged operation. Some, but not all, systems (those which have a bpf device that is used by libpcap) have a means of setting permissions on the bpf device to allow this without running as root.

eswenson1 commented 5 years ago

Ah yes. I didn't realize I had to run as root. Running as root does allow me to attach the eth0 device, and I get this output, now, when I do "show eth":

sim> show eth
ETH devices:
 eth0   eth0                                 (No description available)
 eth1   tap:tapN                             (Integrated Tun/Tap support)
 eth2   nat:{optional-nat-parameters}        (Integrated NAT (SLiRP) support)
 eth3   udp:sourceport:remotehost:remoteport (Integrated UDP bridge support)
Open ETH Devices:
 IMP    eth0 (No description available)
Ethernet Device:
  Name:                    eth0
  Reflections:             0
  Self Loopbacks Sent:     2
  Self Loopbacks Rcvd:     0
  Host NIC Address:        F2:3C:91:D4:29:2D
  Packets Sent:            4
  Asynch Interrupts:       Enabled
  Interrupt Latency:       0 uSec
  Read Queue: Count:       0
  Read Queue: High:        0
  Read Queue: Loss:        0
  Peak Write Queue Size:   1
  BPF Filter: (((ether dst 00:00:02:2C:4D:6C) or (ether dst FF:FF:FF:FF:FF:FF)))
sim>

However, I'm unable to telnet to my ITS from my host. The telnet command (kermit) just times out:

eswenson@localhost:~/ex-its-2$ telnet 10.3.0.6
Executing /home/eswenson/.mykermrc...
 Trying 10.3.0.6...

And I'm unable to telnet out from ITS to my host. I have a "nc -l 5000" running on my host, and telnetting to that port on my host's IP address times out as well.

Doing a "show imp" gives me this:

sim> show imp
IMP     MAC=00:00:02:2C:4D:6C, MPX=4, IP=0.0.0.0/0
        GW=0.0.0.0, HOST=10.3.0.6, DHCP Server IP=0.0.0.0, State:SELECTING, Waited 57 seconds
        attached to eth0, DHCP, MIT
sim>

Perhaps I need the second form of config, where GW and IP are given (and not HOST). I'll try that.

I tried that too, with similar results -- no telnet in and no telnet out. Here is the output of "show imp" and "show eth" in that config:

sim> show imp
IMP     MAC=00:00:02:2C:4D:6C, MPX=4, IP=10.3.0.6/32
        GW=66.175.218.1, HOST=0.0.0.0, DHCP disabled
        attached to eth0, MIT
sim> show eth
ETH devices:
 eth0   eth0                                 (No description available)
 eth1   tap:tapN                             (Integrated Tun/Tap support)
 eth2   nat:{optional-nat-parameters}        (Integrated NAT (SLiRP) support)
 eth3   udp:sourceport:remotehost:remoteport (Integrated UDP bridge support)
Open ETH Devices:
 IMP    eth0 (No description available)
Ethernet Device:
  Name:                    eth0
  Reflections:             0
  Self Loopbacks Sent:     2
  Self Loopbacks Rcvd:     0
  Host NIC Address:        F2:3C:91:D4:29:2D
  Packets Sent:            3
  Asynch Interrupts:       Enabled
  Interrupt Latency:       0 uSec
  Read Queue: Count:       0
  Read Queue: High:        0
  Read Queue: Loss:        0
  Peak Write Queue Size:   1
  BPF Filter: (((ether dst 00:00:02:2C:4D:6C) or (ether dst FF:FF:FF:FF:FF:FF)))
sim>

rcornwell commented 5 years ago

This is what I am seeing:

sim> sh imp
IMP     MAC=9A:72:60:F7:8C:11, MPX=4, IP=0.0.0.0/0
        GW=0.0.0.0, HOST=10.3.0.6, DHCP Server IP=0.0.0.0, State:OFF, Waited 0 seconds
        attached to tap:tap0, DHCP, MIT
sim> sh eth
ETH devices:
 eth0   tap:tapN                             (Integrated Tun/Tap support)
 eth1   nat:{optional-nat-parameters}        (Integrated NAT (SLiRP) support)
 eth2   udp:sourceport:remotehost:remoteport (Integrated UDP bridge support)
Open ETH Devices:
 IMP    tap:tap0
Ethernet Device:
  Name:                    tap0
  Reflections:             0
  Self Loopbacks Sent:     2
  Self Loopbacks Rcvd:     0
  Packets Sent:            2
  Packets Received:        8
  Asynch Interrupts:       Enabled
  Interrupt Latency:       0 uSec
  Read Queue: Count:       4
  Read Queue: High:        4
  Read Queue: Loss:        0
  Peak Write Queue Size:   0
sim>

markpizz commented 5 years ago

It seems that you are running as non-root (or you don't have libpcap-dev installed). I don't know what behavior to expect when running as non-root, but if a TAP connection worked before my recent changes, then non-root should be OK.

sim> sh imp
IMP     MAC=9A:72:60:F7:8C:11, MPX=4, IP=0.0.0.0/0
        GW=0.0.0.0, HOST=10.3.0.6, DHCP Server IP=0.0.0.0, State:OFF, Waited 0 seconds
        attached to tap:tap0, DHCP, MIT
sim>

This would be completely expected immediately after having done your "ATTACH IMP tap:tap0", before execution has started. Since DHCP is a protocol which has to run within the IMP, like other devices it runs 'in between' instruction execution when events fire on its various units. A SHOW QUEUE command will display the various pending events and when (during instruction execution) they will fire.

Meanwhile, is there something within the local system's network environment that will be providing DHCP service on the TAP connection? If not, then once instruction execution starts, the DHCP state will change, as the protocol flows along looking for an IP address, but it won't be finding a DHCP server. This would be the same as plugging a wired connection into a switch that doesn't have a router (DHCP server) on the wire.

If you previously had a TAP setup working (without DHCP), then you must have provided explicit IP=ddd.ddd.ddd.ddd/nn and possibly GW=ddd.ddd.ddd.ddd. Those commands will still work and will implicitly cause DHCP to be disabled.

rcornwell commented 5 years ago

Sorry... I have the tap device owned by me. And this is after ITS is up and running. It appears as if DHCP service is no longer being requested. When I get home from work tonight I will see what is going on with wireshark. This setup was working before I applied your patches. I have the MAC address in my router to provide a fixed IP address for the ITS machine. It also works if I use a different MAC address; then I get the next available address from the DHCP range.

Also as to the addition of the extra options, they did not apply in the case I was using, hence I did not add them. My DHCP server is a Cisco RV320 so it should be properly handling DHCP requests.

markpizz commented 5 years ago

@eswenson1 says:

Perhaps I need the second form of config, where GW and IP are given (and not HOST). I'll try that.

I listed "some" of the various network situations that folks may have in "walk before you run" order. Where "walk before you run" means simplest to work before the more complicated cases.

Notice that I started each description with the host networking conditions, and after the configuration steps were spelled out, the specific expectations of what should work were stated. Configurations 1 and 2 explicitly said "reach systems on the LAN and beyond" and also stated that if the host system were Windows, then that host system could also reach the simulator. When I explicitly mentioned Windows, I also meant that on non-Windows systems the host system would not be expected to be able to reach the simulator. Configurations 3 and 4 explicitly stated that the host system can reach the simulator without any mention of a particular host type.

Given the now fully working interface to sim_ether, I'll be editing the simh 0readme_ethernet.txt document to include Kx10 IMP, 3b2 NI, and Unibus XU as the list of simulators that have Ethernet connectivity. This document currently mentions the features and foibles that are involved with getting working Ethernet communications. Most of what I'll be adding there will merely be adding the names of the simulators and their devices. The host and network connection details and complexities are already spelled out.

@rcornwell said:

Sorry... I have the tap device owned by me. And this is after ITS is up and running. It appears as if DHCP service is no longer being requested. When I get home from work tonight I will see what is going on with wireshark. This setup was working before I applied your patches. I have the MAC address in my router to provide a fixed IP address for the ITS machine. It also works if I use a different MAC address; then I get the next available address from the DHCP range.

Also as to the addition of the extra options, they did not apply in the case I was using, hence I did not add them. My DHCP server is a Cisco RV320 so it should be properly handling DHCP requests.

This appears to be a networking environment configuration that may be beyond the ones previously discussed in the 0readme_ethernet.txt document, and which I possibly haven't encountered before. It sounds similar to a bridged setup that I've seen, in which the local system's network stack passes the traffic out to your wire when needed. Please describe exactly how you set up the tap device on your host system.

rcornwell commented 5 years ago

My setup for Tap is identical to the one I used before your patch which has worked just fine for the last several months.

Expect the KL10 NIA adapter in future after I get KL10 up and running. This will be straight connection to sim_ether since it operates at the ethernet packet layer.

markpizz commented 5 years ago

@rcornwell:

My setup for Tap is identical to the one I used before your patch which has worked just fine for the last several months.

I'm not suggesting it has changed, I'm looking for how it is setup.

Expect the KL10 NIA adapter in future after I get KL10 up and running. This will be straight connection to sim_ether since it operates at the ethernet packet layer.

That will have the same host issues that all the other Ethernet devices have. It makes sense to understand and possibly simplify those now.

rcornwell commented 5 years ago

#!/bin/sh
HOSTIP=`/sbin/ifconfig enp3s0 | gawk -- '/inet/{ print substr($2,6) }'`
HOSTNM=`/sbin/ifconfig enp3s0 | gawk -- '/inet/{ print substr($4,6) }'`
HOSTBR=`/sbin/ifconfig enp3s0 | gawk -- '/inet/{ print substr($3,7) }'`
HOSTGW=`/sbin/route -n | gawk -- '/^0.0.0.0/{ print $2 }' | head -n 1`

echo "Host Addr ${HOSTIP}"
echo "Netmask ${HOSTNM}"
echo "Broadcast ${HOSTBR}"
echo "Default GW ${HOSTGW}"

/usr/sbin/tunctl -t tap0 -u rich
/usr/sbin/tunctl -t tap1 -u rich
/usr/sbin/tunctl -t tap2 -u rich
/sbin/ifconfig tap0 up
/sbin/ifconfig tap1 up
/sbin/ifconfig tap2 up

# Now convert eth0 to a bridge and bridge it with the TAP interface

/sbin/brctl addbr br0
/sbin/brctl addif br0 enp3s0
/sbin/brctl setfd br0 0
/sbin/ifconfig enp3s0 0.0.0.0
/sbin/ifconfig br0 $HOSTIP netmask $HOSTNM broadcast $HOSTBR up

# set the default route to the br0 interface

/sbin/route add -net 0.0.0.0/0 gw $HOSTGW

# bridge in the tap device

/sbin/brctl addif br0 tap0
/sbin/ifconfig tap0 0.0.0.0
/sbin/ifconfig tap1 0.0.0.0
/sbin/ifconfig tap2 0.0.0.0
echo "nameserver $HOSTGW" >>/etc/resolv.conf

eswenson1 commented 5 years ago

I never got TAP to work on my Linode, and haven't tried it with the new pdp10-ka. I will try that again.

I confess I'm no networking expert, and have so far failed to find anything that works on my Linux VPS (Linode) other than using SIMH nat, which I find far less convenient, since I have to map ports between host and ITS. For my main ITS machine, I use KLH10, which supports tun devices, and I've never had any issues with its networking. As far as I'm concerned, SIMH networking is sub-par, and the only reason I'm using pdp10-ka is to help Lars test out the RH10 support, and now to help you guys test networking. If I can't get something better than nat working for my EX system (the one currently using pdp10-ka), I'll just go back to using KLH10 with it.

That said, because I don't understand all this networking stuff, I'd like to present a challenge to you networking gurus: I have a Linux Linode with a single NIC (eth0) with a single (public) ip address. As far as I know, my Linode will not allow me to use DHCP to get another IP address. With tun, I can trivially set up (actually KLH10 sets up) an ITS that can connect to other hosts on the Internet and that can be reached from the Internet. This is what I desire to do with pdp10-ka (SIMH). All I had to do with KLH10 was to configure iptables to do forwarding from my external NIC (eth0) to my tun device (to which KLH10 attached). I was able to use iptables to allow external access to select ports on ITS. And I was able to allow ITS to access my local SMTP server (running on Linux), as well as make outbound connections to Internet hosts.

So all I want (he muses) is to be able to do the same thing with pdp10-ka. If tun is not (for some reason completely incomprehensible to me) "the right thing to do" in SIMH, then what is?

Finally, I've tried setting up a bridge between eth0 and tap0 (earlier) and I could never get this to work. Every time I attempted the bridge, I'd lose connectivity (from outside) to my Linux host (over eth0). I end up having to use the Linode control panel to attach to the console and destroy the bridge device to restore connectivity.

I'm hoping one of you networking gurus can tell me what I need to do with SIMH in order to get similar functionality as I have with KLH10/tun.

larsbrinkhoff commented 5 years ago

I had the same alarming experience when I tried to experiment with Ethernet bridges on my VPS machine. Eventually I cooked up the script below which seems to do what I need. I'm not a networking expert either, so I may well have gotten some details wrong.

https://github.com/PDP-10/its/blob/master/build/pdp10-ka/tap.sh

eswenson1 commented 5 years ago

I decided to try your script -- although I've done what I believe to be equivalent steps many, many times already. The result was the same as with my manual attempts. Once I've setup the bridge and brought up the eth0, tap0, and br0 interfaces, I can no longer SSH to my host. And any process on the host (e.g. from the Linode root console interface) can no longer access the Internet.

Your script is making some assumptions -- one is that there exists an accessible DHCP server that can acquire a new IP address for br0. The "dhclient" request sends lots of DHCP requests, but never gets an address for my Linode VPS. My guess is that I am not allowed to get another IP address through DHCP.

The other assumption you are making is that the host will allow arbitrary MAC addresses on interfaces connected to the host network. In my case, I don't think this is the case. I tried messing with your config to work around lack of DHCP and to set the MAC address on br0 and tap0, but none of these had any positive effect.

I've never been able to set up a bridge between a tap interface and my eth0 interface.

rcornwell commented 5 years ago

First you will need to work with Linode to get a second IP address for your account, and either give them or have them give you a MAC address. The MAC address is not really attached to the bridge or tap. Linode probably also does not allow arbitrary IP addresses on their net.

Note NAT interface is the same as Tun interface, but does not require the extra iptables to set up.

If you wish to have a simulator hooked up to a TAP interface on Linode you will have to work with their support department to allocate you an IP. They can probably also add additional configuration operations to your system to auto create a group of TAP interfaces.

I do not know why I need to add the entry to /etc/resolv.conf, however without it DNS would not work.

eswenson1 commented 5 years ago

I did ask Linode support to give me another IP address -- they did -- a private IP address. It appears to be associated with the eth0:1 (alias for eth0?). I could ask for, and pay for, a second public ip address. I haven't done this (because I'm cheap).

Do you think the private IP address will work, or will it have to be a public IP?

I could ask them about a second MAC address. I'll give them the minimum sequence of commands that cause my eth0 to no longer work and ask them why and what I need to do to fix it.

rcornwell commented 5 years ago

I think your best answer is NAT. This also exposes ITS the least. If it is up on a TAP it could be subject to attack. Having the ports at non-standard locations is probably the most secure.

eswenson1 commented 5 years ago

Ok. Thanks, Rich. I did have this working with NAT. What I didn't like was the fact that I had to map all the ports. But I guess that was a minor inconvenience. I thought, perhaps, with the latest fixes from Mark, that I could use pdp10-ka without a tap device -- in other words directly, using the first of his two networking options. I didn't understand Mark's comment about this working on Windows but not Linux.

I think what you're saying is that the issues stem from Linode limitations (no DHCP acquisition of additional IP addresses, no additional public IP address, no allowance for unapproved Linode MACs and IP addresses, etc.). And that if Linode supported these, I'd be able to use a tap device.

I'll go back to using nat -- a setup I had working fine (thanks to you) before all these attempts to try alternatives.

rcornwell commented 5 years ago

I think I see the problem with your patch. You never start the 2nd unit. I am not sure where you want this to be started or I would add the line.

Also be careful with memcpy of IP addresses; this can lead to weird issues if endianness is different.

markpizz commented 5 years ago

I think I see the problem with your patch. You never start the 2nd unit. I am not sure where you want this to be started or I would add the line.

I changed nothing about how the second unit is started. It is started, as you wrote it, in imp_send_packet() when outgoing data is transmitted.

The memcpy of IP addresses is done correctly and is platform independent. Look at it this way: in_addr_T is a 32-bit scalar value which is a proxy for a struct in_addr, a 32-bit structure whose contents are always represented in network byte order. The 32-bit scalar lends itself easily to 1) direct value equality comparisons, and 2) manipulation by masking with network masks that are also held in in_addr_T scalars in network byte order. memcpy is used rather than direct assignment of a structure member because some IP addresses arrive in from-the-wire structures where the IP address element may not be naturally aligned on a 32-bit boundary, and on some host platforms an unaligned reference may hurt performance or may even fault.

The only places where 32-bit values from a packet are interpreted numerically are the lease, renew and rebind times. In those cases the value is first copied to the int element with memcpy for alignment reasons, and then converted to host byte order with ntohl().
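The two cases described above can be sketched in a few lines of C. This is a minimal illustration, not the code from the IMP source: the helper names get_ip, get_u32 and same_subnet are hypothetical, in_addr_T is redeclared here as a plain uint32_t so the sketch compiles on its own, and arpa/inet.h assumes a POSIX host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohl() on POSIX hosts */

/* Assumption: a 32-bit scalar that always holds network byte order. */
typedef uint32_t in_addr_T;

/* Hypothetical helper: copy an IP address out of a packet buffer at an
   arbitrary, possibly unaligned, offset. No byte swapping is done, so
   the result stays in network byte order on every host. */
static in_addr_T get_ip(const uint8_t *pkt, size_t off)
{
    in_addr_T ip;
    assert(pkt != NULL);
    memcpy(&ip, pkt + off, sizeof(ip));
    return ip;
}

/* Hypothetical helper: read a field that is used as a number (for
   example a lease time in seconds). memcpy handles alignment, then
   ntohl() converts to host byte order. */
static uint32_t get_u32(const uint8_t *pkt, size_t off)
{
    uint32_t v;
    assert(pkt != NULL);
    memcpy(&v, pkt + off, sizeof(v));
    return ntohl(v);
}

/* Equality and masking work directly on network-order scalars, as long
   as the mask is in network byte order too. */
static int same_subnet(in_addr_T a, in_addr_T b, in_addr_T mask)
{
    return (a & mask) == (b & mask);
}
```

Only get_u32 calls ntohl(), because only its result is treated as a quantity; addresses and masks are compared and combined byte for byte, so host endianness never enters into it.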

rcornwell commented 5 years ago

Ok... you have the DHCP run off unit[2] of the IMP device. I see no sim_activate anywhere for unit[2]. imp_dhcp_discover was called in imp_timer_task which was run every time imp_eth_srv was called. This task was designed to handle timeouts for resends, DHCP and arp entry timeouts. You created a third unit for IMP to handle DHCP and arp aging requests. However you never schedule this task to run hence it never gets kicked off to try and gain a DHCP address. A search of your file shows no references to imp_unit[2].

markpizz commented 5 years ago

Actually, I used a reference I saw you do in imp_detach() and referred to uptr+2 in the second to last line of imp_attach(). The only attachable unit is unit[0], so in imp_attach(), uptr+2 is unit[2].
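The pointer arithmetic behind that remark can be shown with a stand-alone sketch. The UNIT type here is a one-field stand-in rather than the SIMH structure, and third_unit_from is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the SIMH UNIT structure; only the array layout matters. */
typedef struct { int id; } UNIT;

static UNIT imp_unit[3];

/* If uptr points at element 0 of the unit array, uptr + 2 is element 2. */
static UNIT *third_unit_from(UNIT *uptr)
{
    assert(uptr != NULL);
    return uptr + 2;
}
```

This only holds when the pointer passed in really is unit[0], which is the case in an attach routine when unit[0] is the only attachable unit.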

markpizz commented 5 years ago

A SHOW QUEUE command immediately after the ATTACH IMP command will show that this unit is scheduled.

rcornwell commented 5 years ago

Ok. Then I am at a loss to explain why I am not seeing DHCP packets being sent out, and why the interface remains in idle mode.

When ITS starts up it will send 3 nop's to the IMP and wait for 3 nop's back.

KA-10 event queue status, time = 1264410536, executing 47,307,480 instructions/sec
  CPU at 788427 (16.666 msecs) (Idle capable)
  MTY at 788427 (16.666 msecs) (Idle capable)
  DPK at 788427 (16.666 msecs) (Idle capable)
  IMP unit 1 at 788427 (16.666 msecs) (Idle capable)
  TEN11 at 788427 (16.666 msecs) (Idle capable)
  AUXCPU at 788427 (16.666 msecs) (Idle capable)
  CTY unit 1 at 788427 (16.666 msecs) (Idle capable)
  TK at 788427 plus 10566677 usecs (10.583343 seconds total) (Idle capable)
asynchronous pending event queue
  IMP unit 0 event delay 0
asynch latency: 4000 nanoseconds
asynch instruction latency: 189 instructions

I do not see IMP unit 2 in this list.

markpizz commented 5 years ago

Is the host system running under your bridge configuration or merely using the default network config?

To simply test the DHCP stuff, I'd start with the default OS network config and:

   sim> SET  IMP HOST=ddd.ddd.ddd.ddd
   sim> ATTACH IMP enp3s0

This should test DHCP to the IMP on your LAN interface. You should be able to see that traffic with Wireshark.

rcornwell commented 5 years ago

This is not possible: since my home directory is over NFS, I can't point the link directly to the interface.

Under Wireshark I see zero packets from the IMP other than the initial loopback probe.

rcornwell commented 5 years ago

I solved the problem. The sim_activate_after needs to be moved to imp_reset. I will push this out later today.

markpizz commented 5 years ago

Right.

However, it should be conditional on (imp_unit[0].flags & UNIT_ATT).

I didn't see this problem since my test cases all started execution with GO which doesn't clear the event queue before execution starts.
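The fix being discussed, scheduling the DHCP/ARP timer unit from the reset routine but only when unit 0 is attached, might look roughly like the sketch below. Everything here is a simplified stand-in so the fragment compiles by itself: the UNIT structure, the UNIT_ATT bit value, the stubbed sim_activate_after, the routine name imp_reset_sketch and the one-second delay are all assumptions, not the actual code that was pushed.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for SIMH definitions (assumptions, not the real ones). */
#define UNIT_ATT 0x10u                 /* placeholder bit, not SIMH's value */
typedef struct {
    uint32_t flags;
    int      scheduled;                /* set by the stub below */
} UNIT;

/* Stub that records the request instead of queueing a timed event. */
static int sim_activate_after(UNIT *uptr, uint32_t usec_delay)
{
    (void)usec_delay;
    assert(uptr != 0);
    uptr->scheduled = 1;
    return 0;
}

static UNIT imp_unit[3];

/* Sketch of a reset routine: start the timer unit (unit 2) only when
   the network unit (unit 0) is attached. */
static int imp_reset_sketch(void)
{
    if (imp_unit[0].flags & UNIT_ATT)
        return sim_activate_after(&imp_unit[2], 1000000); /* hypothetical 1 s */
    return 0;
}
```

Placing the call in the reset routine matters in the scenario described above, where the event queue is cleared before execution starts and a unit scheduled only at attach time would be lost.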

rcornwell commented 5 years ago

Ok. I will push out changes later. I also folded a couple of very long lines. I like to keep line lengths under 90 characters or even 80 characters if possible.

markpizz commented 5 years ago

@eswenson1 said:

I have a Linux Linode with a single NIC (eth0) with a single (public) ip address. As far as I know, my Linode will not allow me to use DHCP to get another IP address. With tun, I can trivially set up (actually KLH10 sets up) an ITS that can connect to other hosts on the Internet and that can be reached from the Internet. This is what I desire to do with pdp10-ka (SIMH). All I had to do with KLH10 was to configure iptables to do forwarding from my external NIC (eth0) to my tun device (to which KLH10 attached). I was able to use iptables to allow external access to select ports on ITS. And I was able to allow ITS to access my local SMTP server (running on Linux), as well as make outbound connections to Internet hosts.

This, along with other things you've said about the restrictions in your setup (having a single public IP) says that you must be already solving the problem yourself in your iptables setup.

What I mean here is that there can't be any systems beyond the host (on the host's LAN or further afield on the Internet) which can reach BOTH the host system AND the KLH10 (or pdp10-ka) on the SAME port. The iptables setup must be directing incoming traffic for any particular port to EITHER the host's network stack, OR into the tun (KLH10), but NOT BOTH.

Within your host system the tun (or a potential tap configuration) allows communications between the host and the simulated system. iptables must be providing a means of letting some external traffic arrive inbound for the host system (i.e. incoming SMTP) while directing some other traffic towards the simulator.

Given the fact that your Linode VPS is an island to itself, and not merely present on a LAN which you've got other systems running on, you really wouldn't be a candidate to set up a bridge for traffic to/from the simulator and the host system. Rich's model has several potential systems running in the host (hence the tap0, tap1, etc.) and also has other systems on the LAN that may want or need to be reachable.

So, as I've said before, getting back to your tun vs tap question and having a simh simulator working as well as the KLH10, I'm 100% sure I can come up with a recipe that can do the same for tap under simh as tun is working under KLH10. In order to figure out that recipe I'll have to poke around in a system that is running KLH10 and possibly have access to the source that the running KLH10 simulator was built with. If you can get me access to a working system like that I'll be glad to work things out.

rcornwell commented 5 years ago

To access the various ITS builds, just git clone the ITS repo and run make with the correct EMULATOR variable. You can configure it for KLH10(KS), simh(KS), pdp10-ka(KA). The setup is very automatic. Just run "make EMULATOR=??? clean all"; about 1 hour later you have yourself a complete working ITS system.

larsbrinkhoff commented 5 years ago

No, make clean and git clean -xfd will not clean the subrepositories.

markpizz commented 5 years ago

To access the various ITS builds, just git clone the ITS repo and run make with the correct EMULATOR variable. You can configure it for KLH10(KS), simh(KS), pdp10-ka(KA). The setup is very automatic. Just run "make EMULATOR=??? clean all"; about 1 hour later you have yourself a complete working ITS system.

That gets me something built locally, but not configured or easily testable for the situation that @eswenson1 wants to work for the particular network connection. I'm looking for a working setup that can have the various pieces tweaked a little to work with simh (which might need to be extended slightly), or at least have a documented procedure that gets him equivalence for KLH10 and pdp10-ka.

eswenson1 commented 5 years ago

@markpizz You are correct that on my VPS with ITS under KLH10 and external access to/from ITS on this setup, I'm using iptables to do the IP forwarding and rules to specify which ports should be forwarded from outside to which ports on the ITS host.

I'm perfectly happy doing the same thing with ITS under SIMH with tap -- if I could get it to work. So far, when I create a tap device, and let ITS (SIMH) connect to it, I'm unable to get any traffic from my host to ITS (or the other way around). If I could get this far, then I could use iptables to forward inbound/outbound traffic to the Internet. This is, of course, all that matters in the end. I don't care whether it is tun or tap, but I just need to be able to get ITS hosted on my Linux VPS without the need for a second NIC, or a second IP address, or a bridge (which won't work because I need a new MAC address that is acceptable to Linode, and they can't give me one -- I asked).

So if someone could (without a bridge) get ITS running under SIMH with tap, and figure out what magic needs to be done to allow local access from the host to ITS and vice-versa, then I'm sure I can get the iptables magic for external access -- just like I have for KLH10.

eswenson1 commented 5 years ago

I've tried this (most recently), with no success:

sudo tunctl -u eswenson -t tap0
sudo ifconfig tap0 10.3.0.1 up
echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward
sudo iptables -A FORWARD -i tap0 -j ACCEPT
sudo iptables -A FORWARD -o tap0 -j ACCEPT
sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

And with that, I should be able to have traffic forwarded to the tap0 device, whose address is 10.3.0.1. I then configure SIMH thus:

set imp enabled
set imp mac=2a:4b:23:2e:12:67
set imp mpx=4
set imp ip=10.3.0.1
set imp gw=66.175.218.1
set imp host=10.3.0.6
#set imp dhcp
at imp tap:tap0

Where GW is the actual gateway on eth0 that my host uses to get to the internet. HOST is the IP of my ITS system. And IP is the tap0 IP address. When I bring up ITS, I cannot do "telnet 10.3.0.6" (or "supdup 10.3.0.6") from my Linux host.

markpizz commented 5 years ago

Thanks for the example that doesn't work. I do see some potential problems.

Meanwhile, what are all of the equivalent things for the KLH10 case that does work?

Your case is more complicated than the simplest example.

I'm really looking for the working example that I can map to the equivalent stuff with simh.

I think we are 6 timezones apart right now. I'll be up by around noon EDT tomorrow, so anything you can provide by then will be helpful.

ifconfig -a

As well as the output of the routing table on this Linux box. I can't recall the command without poking around.

eswenson1 commented 5 years ago

Ok. I can make a tarball of a bootable ITS under KLH10 and provide you with the relevant config that allows connecting to ITS from the host.

No special iptables setup is required. KLH10 creates the tun interface and configures everything itself.

I do use iptables to allow forwarding and NAT from outside my host (Internet) and ITS, but that is really immaterial — if I can route local traffic from my host to ITS (and vice versa) then I can do the iptables magic to expose on the Internet.

The issue with SIMH for me is that I can’t even access ITS with tap from my host, so anything more complex isn’t going to work either.

eswenson1 commented 5 years ago

Oh, and to display the routes on Linux, you can use “route -n”, where -n just specifies that addresses be displayed numerically, rather than trying to look up hostnames.

rcornwell commented 5 years ago

One thing you might want to do is take my NATing code for FTP and add it into Slirp so that FTP will work under the nat interface. I also have pretty fast code for updating the TCP checksums.

eswenson1 commented 5 years ago

I've placed a tarball of a bootable KLH10-based ITS system here: https://s3.amazonaws.com/eswenson-its/public/mark-klh10.tgz.

If you look in dskdmp.ini, you'll see the config and the IP address (192.168.1.100) for which the ITS is configured.

To start it, type: "sudo ./kn10-ks-its dskdmp.ini"

At the KLH10 prompt, type "go". Then, you will end up in DSKDMP. Type "its" followed by "g". ITS should start up. Ignore the "top level interrupt" message you'll see. You should be able to "supdup 192.168.1.100" or "telnet 192.168.1.100" from your host. No other setup should be required.

markpizz commented 5 years ago

@eswenson1 On your running KLH10 setup, will you please provide the output of:

$ ifconfig -a
$ route -n

eswenson1 commented 5 years ago

ifconfig -a:

dummy0    Link encap:Ethernet  HWaddr 5e:c9:16:ee:cb:8e
          BROADCAST NOARP  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

erspan0   Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          BROADCAST MULTICAST  MTU:1450  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth0      Link encap:Ethernet  HWaddr f2:3c:91:d4:29:2d
          inet addr:66.175.218.130  Bcast:66.175.218.255  Mask:255.255.255.0
          inet6 addr: 2600:3c01::f03c:91ff:fed4:292d/64 Scope:Global
          inet6 addr: fe80::f03c:91ff:fed4:292d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:35488950 errors:0 dropped:0 overruns:0 frame:0
          TX packets:31368510 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3163516197 (3.1 GB)  TX bytes:130838388457 (130.8 GB)

eth0:1    Link encap:Ethernet  HWaddr f2:3c:91:d4:29:2d
          inet addr:192.168.219.106  Bcast:0.0.0.0  Mask:255.255.128.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

gre0      Link encap:UNSPEC  HWaddr 00-00-00-00-80-00-00-00-00-00-00-00-00-00-00-00
          NOARP  MTU:1476  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

gretap0   Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          BROADCAST MULTICAST  MTU:1462  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

ip6_vti0  Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          NOARP  MTU:1364  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

ip6gre0   Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          NOARP  MTU:1448  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

ip6tnl0   Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          NOARP  MTU:1452  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

ip_vti0   Link encap:IPIP Tunnel  HWaddr
          NOARP  MTU:1480  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:760866 errors:0 dropped:0 overruns:0 frame:0
          TX packets:760866 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:43559858 (43.5 MB)  TX bytes:43559858 (43.5 MB)

sit0      Link encap:IPv6-in-IPv4
          NOARP  MTU:1480  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

tap0      Link encap:Ethernet  HWaddr aa:5f:40:24:d7:77
          inet addr:10.3.0.1  Bcast:10.255.255.255  Mask:255.0.0.0
          inet6 addr: fe80::a85f:40ff:fe24:d777/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:127 errors:0 dropped:20 overruns:0 frame:0
          TX packets:70 errors:0 dropped:2 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:18585 (18.5 KB)  TX bytes:5072 (5.0 KB)

teql0     Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          NOARP  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

tun0      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          inet addr:66.175.218.130  P-t-P:192.168.1.100  Mask:255.255.255.255
          inet6 addr: fe80::7027:cbb:a73c:c91e/64 Scope:Link
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:0 (0.0 B)  TX bytes:48 (48.0 B)

tunl0     Link encap:IPIP Tunnel  HWaddr
          NOARP  MTU:1480  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

route -n:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         66.175.218.1    0.0.0.0         UG    0      0        0 eth0
0.0.0.0         66.175.218.1    0.0.0.0         UG    0      0        0 eth0
10.0.0.0        10.3.0.1        255.0.0.0       UG    0      0        0 tap0
66.175.218.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.1.100   0.0.0.0         255.255.255.255 UH    0      0        0 tun0
192.168.128.0   0.0.0.0         255.255.128.0   U     0      0        0 eth0

larsbrinkhoff commented 5 years ago

I'm updating my "TT" ITS to use the latest pdp10-ka. I will try to have it use the TAP interface.