SDL-Hercules-390 / hyperion

The SDL Hercules 4.x Hyperion version of the System/370, ESA/390, and z/Architecture Emulator

OSA VIPA feature incompatible with multiple OSA device emulation. #485

Open bw-gh opened 2 years ago

bw-gh commented 2 years ago

In a 2 interface situation using OSA devices at 0400.3 and 0404.3, with some emulated system activating both interfaces, SETIP(+VIPA) may be received for the adapter opposite the adapter intended in addition to SETIP (DEFAULT) for the intended adapter. This can render the host network configuration unusable. Given the limited runtime and emulation goals of Hercules, it seems that VIPA SETIP commands should be rejected.

I have verified that rejecting VIPA SETIP allows successful network operation on both interfaces, but since there were explicit tests for VIPA in qeth.c, I'm looking for a wider opinion before creating a pull request.

Built from my fork of this project, based on the develop branch:

HHC01417I ** The SoftDevLabs version of Hercules **
HHC01415I Build date: Apr  6 2022 at 19:35:29
HHC01417I Built with: GCC 10.2.1 20210110
HHC01417I Build type: GNU/Linux x86_64 host architecture build

Fish-Git commented 2 years ago

SETIP(+VIPA) may be received for the adapter opposite the adapter intended in addition to SETIP (DEFAULT) for the intended adapter.

Forgive me for asking what is probably a dumb question (I am not experienced/familiar with VIPA on "the emulated system" that I suspect you're using), BUT... Isn't it the responsibility of the guest O/S (i.e. "emulated system") to direct its OSA command(s) to the proper/desired device?

In other words, how can a "SETIP(+VIPA)" be received on "the adapter opposite the adapter intended" unless "the emulated system" specifically issued the command to that adapter?

I have not reviewed Hercules's QETH/OSA handler logic before posting this response, but AFAIK, Hercules does not arbitrarily choose which OSA adapter a given command is processed for. AFAIK, command packets are sent by the guest to whichever adapter (device address) it wants to send them to, and AFAIK Hercules processes such command packets for that adapter. So how can such "SETIP(+VIPA)" (command packets?) be processed on the "wrong adapter" as you claim, unless the guest O/S ("emulated system") specifically sent it there on purpose?

Again, I'm not that familiar/experienced with using OSA devices, but are you sure you have them configured correctly in your "emulated system" (guest OS)?

Given the limited runtime and emulation goals of Hercules, it seems that VIPA SETIP commands should be rejected.

Why? Are we not doing what real OSAs do? Because if we're not, then IMO we need to fix that, not rip it out!

but since there were explicit tests for VIPA in qeth.c, ...

I'll take a look. But again, I'm not all that familiar with VIPA and how it works (or is supposed to work) on OSA devices. In the meantime, can you share your proposed change(s) with us? What's the URL of the fork you intend to submit a pull request from?

bw-gh commented 2 years ago

VIPA provides the means for network resiliency in the face of adapter failure, media failure ... you get the picture. It allows the OSA stack to choose transport alternatives.

The project qeth code is treating a VIPA SETIP as if it were a regular primary IP ... which it is not; to address your specific question: we are definitely NOT doing what real OSA's do, nor should we for our more limited goals.

This is not an observation of a CCW being directed to the wrong device, it's a case of qeth treating a special request in the same way as a primary request for a given device. I'm quite sure my emulated system is configured correctly, and I am quite familiar with the overall environment. My point about Hercules goals was simply that as cool a tool as it is, it has no real place in emulating mainframe resiliency scenarios.

My local fork code rejects the VIPA SETIP, but one could equally argue for a (non-debug) silent ignore since in either circumstance, normal operation is not affected (no emulation of VIPA behaviors exists, so either option could be construed as acceptable).

One could also legitimately decide there's a reason to emulate some of the special OSA VIPA behaviors in some future development effort and use a configuration keyword to enable/disable such processing ... but IMHO there isn't a justification for attempting the emulation of VIPA resiliency; we should just ensure it doesn't disrupt normal operation.
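To make the two options concrete, here is a minimal sketch of the decision, assuming a handler that can tell a DEFAULT request from a VIPA request. Every name in it (`decide_setip`, `vipa_policy`, `REPLY_REJECTED`) is hypothetical and is not taken from qeth.c, and the reply values are placeholders rather than real IPA return codes.

```c
#include <assert.h>
#include <stdbool.h>

/* All names here are hypothetical; none come from qeth.c. */
enum setip_kind  { SETIP_DEFAULT, SETIP_VIPA };
enum vipa_policy { VIPA_REJECT, VIPA_IGNORE };

#define REPLY_OK        0   /* placeholder: success reported to the guest */
#define REPLY_REJECTED  1   /* placeholder: not a real IPA return code    */

struct setip_result {
    bool apply_to_interface;  /* true: reconfigure the host interface */
    int  reply_code;          /* value reported back to the guest     */
};

/* Decide what to do with one SETIP request under the given policy. */
struct setip_result decide_setip(enum setip_kind kind,
                                 enum vipa_policy policy)
{
    struct setip_result r = { false, REPLY_OK };

    assert(kind == SETIP_DEFAULT || kind == SETIP_VIPA);

    if (kind == SETIP_DEFAULT) {
        r.apply_to_interface = true;      /* normal primary address */
        return r;
    }

    /* VIPA: never touch the host interface. */
    if (policy == VIPA_REJECT)
        r.reply_code = REPLY_REJECTED;    /* tell the guest "no"     */
    return r;                             /* VIPA_IGNORE: silent ack */
}
```

Under either policy the host interface is only reconfigured for DEFAULT requests; the policies differ only in what the guest is told.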

mcisho commented 2 years ago

we are definitely NOT doing what real OSA's do

Are you saying that as someone who does know what real OSA's do? I am the author of parts of qeth and I definitely don't know what real OSA's do, and neither, I believe, do any of the other authors. All any of us know for sure is what the Linux kernel source code implies, and what empirical evidence suggests. If you have definitive knowledge, then please tell us what we should be doing and why!

IMHO, implementing VIPA was a bad idea, but others believed it was a good idea, and, as I understand it, now use it successfully.

As a *nix user I avoid VIPA breaking my networking by using preconfigured tuntap interfaces and running Hercules without privileges.

bw-gh commented 2 years ago

I am not an OSA firmware developer, but VIPA and OSA external behaviors are well described in IBM documentation.

Like you, I use pre-configured tuntap interfaces on *nix platforms, and when I configured 2 emulated OSA device groups, I observed that after each tunnel interface is assigned the appropriate destination address upon receipt of SETIP+DEFAULT, one or more of the devices can also receive SETIP+VIPA requests for the opposite adapter's address. Since qeth handles SETIP+VIPA in the same path as SETIP+DEFAULT, the VIPA request invokes hercifc to change the tunnel destination IP to the wrong address.

As long as we have no VIPA behavioral emulation, we should at least take steps to not treat VIPA as SETIP+DEFAULT requests.

I'm not making an argument against taking on a development effort that would add additional emulated behaviors to qeth; I'm just pointing out that until there's VIPA emulation support, we shouldn't accept the SETIP+VIPA request and treat it like SETIP+DEFAULT. In my particular case, I chose to reject the request, but as I mentioned in my previous note, one could equally accept the request, respond to it, and otherwise do nothing, without causing issues.
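The failure sequence described above can be replayed in a small model. This is only a sketch under stated assumptions: `osa_model`, `model_setip`, and `replay_keeps_own_address` are hypothetical names, the addresses are made-up documentation addresses, and the string copy stands in for whatever actually reconfigures the tunnel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model; these are not the qeth.c data structures. */
enum setip_kind { SETIP_DEFAULT, SETIP_VIPA };

struct osa_model {
    char tun_dest[16];   /* destination address set on the tunnel */
};

/* Apply one SETIP request. When guard_vipa is false, VIPA follows the
   same path as DEFAULT, which is the behavior described above. */
void model_setip(struct osa_model *dev, enum setip_kind kind,
                 const char *addr, bool guard_vipa)
{
    assert(dev != NULL && addr != NULL);

    if (kind == SETIP_VIPA && guard_vipa)
        return;                          /* leave the tunnel untouched */

    snprintf(dev->tun_dest, sizeof dev->tun_dest, "%s", addr);
}

/* Replay the two-adapter sequence as seen by the first adapter and
   report whether its tunnel still carries its own address afterwards. */
bool replay_keeps_own_address(bool guard_vipa)
{
    struct osa_model osa0400 = { "" };

    /* Own address arrives as DEFAULT (example address, assumed). */
    model_setip(&osa0400, SETIP_DEFAULT, "192.0.2.1", guard_vipa);
    /* The other adapter's address then arrives as VIPA. */
    model_setip(&osa0400, SETIP_VIPA, "192.0.2.5", guard_vipa);

    return strcmp(osa0400.tun_dest, "192.0.2.1") == 0;
}
```

Without the guard, the second request overwrites the tunnel destination with the other adapter's address; with the guard, the first assignment survives.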

mcisho commented 2 years ago

The home IPv4 address of each and every defined device/link or interface (OSA, VIPA, whatever) is sent to each and every OSA as a VIPA address. This is how z/OS's TCP/IP stack(s) seem to operate. It makes sense for z/OS, as it's offloading all that boring inbound traffic management to the OSA(s).

bw-gh commented 2 years ago

Exactly, and since we don't emulate that part of OSA in qeth, we shouldn't act on VIPA requests.

rgschmi commented 2 years ago

I'm not sure what your environment is, but I would hate to see any VIPA support disabled. I have a single OSA and multiple VIPAs running with Hercules under Windows. The VIPAs are in separate subnets from the OSA adapter, so I'm using OSPF. I do agree that defining two OSAs causes problems.

I have to admit I'm still using LCS interfaces on Linux, so I don't know if my OSA configuration would work there. The only 'real-world' behaviour I've found is Windows with multiple LCS adapters and multiple VIPAs, all in the same subnet. ARP takeover and giveback and VIPA owner all work. CTCI-WIN does not support multicast with LCS interfaces, so there is no OSPF support and a single subnet is required.

Fish-Git commented 2 years ago

It sounds to me like we should probably do what bw-gh suggests: accept the request but otherwise do nothing. At least maybe do that only on Linux? Or maybe make it a device statement option? (i.e. a new OSA/QETH configuration file device statement parameter/option?) E.g. vipa=yes|no? (or vipa=acc/rej?)
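As a rough illustration of the device statement idea, the sketch below parses a single `vipa=yes|no` argument. No such option exists today; `parse_vipa_option` is a hypothetical helper, and how it would be wired into the real device statement parsing is left open.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical parser for a proposed "vipa=yes|no" option.
   Returns true and stores the setting in *enabled when arg is a valid
   vipa option; returns false for anything else. The key is matched in
   lower case only; the value is matched case-insensitively. */
bool parse_vipa_option(const char *arg, bool *enabled)
{
    static const char key[] = "vipa=";
    char value[4] = "";
    size_t i;

    assert(arg != NULL && enabled != NULL);

    if (strncmp(arg, key, sizeof key - 1) != 0)
        return false;                    /* some other option */

    arg += sizeof key - 1;
    if (strlen(arg) >= sizeof value)
        return false;                    /* too long for "yes" or "no" */

    for (i = 0; arg[i] != '\0'; i++)
        value[i] = (char)tolower((unsigned char)arg[i]);
    value[i] = '\0';

    if (strcmp(value, "yes") == 0) { *enabled = true;  return true; }
    if (strcmp(value, "no")  == 0) { *enabled = false; return true; }
    return false;                        /* unrecognized value */
}
```

A caller would try this on each device statement argument and fall back to a default (which default is appropriate is exactly the open question in this thread) when the option is absent.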