SDL-Hercules-390 / hyperion

The SDL Hercules 4.x Hyperion version of the System/370, ESA/390, and z/Architecture Emulator

Request for Native SNA support #348

Closed vinatron closed 2 years ago

vinatron commented 3 years ago

Hello,

I have been a community member for a few years and now have some hardware of my own. This really isn't an issue; however, I would like to have SNA support in Hercules so that I can test SNA topologies without having to keep my frame online.

I am interested in assisting with adding support in any way I can because the original system had an OSA adapter defined in OSC mode for SNA console access. If I can help contribute any data to help with the addition of SNA into Hercules, please let me know.

I don't, however, have a communications controller. If I did, I would provide data on those too, so that native SNA could be supported on the 370 as well. My end goal is to have a Cisco and IBM SNA network going between a client node and my AS/400 and mainframe systems, and to use the Communication Tools MVS/VM bridge on the AS/400 to set up NJE connections over SNA, because that's how the manual explains the connection process.

Thanks for the continued development on the software. It is greatly appreciated. Also, if you need any additional data: my machine isn't the newest, but I can provide whatever you need. It's a z114-M05 with 2 CPs and 8GB of storage "RAM". I also have plans to try and connect my ESCON tape device and see if I can transfer data to real tapes.

Kind regards, Vincent

Fish-Git commented 3 years ago

Hi Vincent!

We currently have no plans to add SNA support to Hercules, but that's probably only because we're short on manpower and already have more "To Do" items on our list than is possible to finish in a lifetime! Nevertheless, we will keep your request open. You can never know when someone might come around and decide to take a crack at it.   ;-)

Thanks for the enhancement request and enjoy your personal mainframes!

vinatron commented 3 years ago

You're very welcome. I will definitely provide any information you need, SNA related or not.

vinatron commented 3 years ago

I made a mistake in the adapter classification that I should correct. Earlier I said OSC. That is an error, because OSC is like the current 3270 emulation done by Hercules. What I meant was OSE, which is OSA non-QDIO and is capable of carrying SNA traffic. For reference, OSD would be QDIO mode, if anyone wanted to know.

PoC-dev commented 3 years ago

Abstract

These are the collected findings as Documented in the groups.io Mailing List H390-MVS at the end of January 2021 regarding Implementation of SNA support in Hercules-Hyperion to use 802.2 DLC on Ethernet via TAP device.

Until then, the consensus among all devs was: There is no SNA support in Hercules.

Documentation for the LCS device hinted at partial SNA support, because the OAT file might contain a SNA directive. Looking at the ctc_lcs.c file revealed that a basic handling of SNA frames is already implemented. It is not known to which extent. This was my motivation to clarify this ambiguity.

Ian states:

QETH is OSD only. SNA would use an OSE(?), which is pretty much identical to LCS.

So I didn't bother to even look at QETH.

Testbed configuration includes

Thanks to (in no particular Order) Fish, Harold Grovesteen, and Ian (no last name known) for helping to shed some light on that issue.

Hercules configuration details

Hercules has been configured with a stock LCS device. Only one, because according to Ian, only TCP/IP support needs two devices: One for reading from and one to write into. E40 is documented as AWS3172, VTAM 3172 emulation in the original P390 configuration file, so I assume the HCD configuration of the guest OS is correct for that type of device also.

0E40 LCS --oat hercules.oat --debug

The OAT file provides a means to route packets to the available ports of the virtual 3172.

*********************************************************
* Dev   Mode  Port  Entry specific information          *
*********************************************************
  0E40  SNA   00
  HWADD 00 02:00:FE:DF:00:42

After the start of Hercules, a virtual device tap0 is automatically created. I bridge this device to an unused NIC in my test equipment:

ip link add br0 type bridge
ip link set dev eth0 up
ip link set dev eth0 master br0
bridge link set dev eth0 state 3
ip link set dev tap0 up
ip link set dev tap0 master br0
bridge link set dev tap0 state 3

Maybe the link state settings are not necessary, because they are handled by STP. STP is off by default.

Guest OS

Before trying anything, I applied the change regarding a missing interrupt handler as laid out in the New User's Cookbook accompanying the distribution.

MIH

It required changing the parmlib member IECIOS00:

MIH TIME=00:00,DEV=E40

I verified successfully that it works after the next IPL, with D IOS,MIH,DEV=E40:

0E40=00:00.

Note: I have added the same configuration change to the E20-E21 devices (CTCI), used solely for IP connectivity before I tried to get E40 running. So far, I've not observed adverse effects.

VTAM

VTAM configuration is mainly unchanged from the original file in sys1.local.vtamlist(xcae40e). The comment in that file states: 3172 (LSAPLAN) ETHERNET ADAPTER1 (E40). Definition:

XPE40E PORT CUADDR=E40,ADAPNO=1,SAPADDR=4,MEDIUM=CSMACD,DELAY=0,TIMER=30

Note: In the original member, this statement is split across two lines, using a continuation character.

The OAT documentation mentions ports starting with 0, while the VTAM configuration mentions ADAPNO=1 by default. It is not yet clear if this is an off-by-one glitch, or another error condition/inconsistency. Further testing might be necessary.

Findings

Trying to activate this major node in VTAM immediately raises an error in the guest OS console:

IST1023E START I/O TIMEOUT OCCURRED FOR CUA=0E40

The LCS debug output revealed that VTAM sends a command frame which is not recognized by the command handler:

HHC00933D 0:0E40 CTC: executing command other (0x41)

I extended the code for ctc_lcs.c and the header file defining the commands, so command 0x41 is redirected to a TCP/IP Start LAN command handler, and 0x42 to the accompanying Stop LAN command handler. See attached diffs:

With help from Ian and Fish, I managed to get some debugging output, including a CCW trace (type t+0E40 into console, in addition to the already enabled debug mode of the LCS driver).

This is the output while IPL was progressing. Obviously some guest OS driver probing or even initializing the hardware:

HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:0 LPM:80 Flags:08000 ....F....... ........ CCW:1F50E8E0
HHC01315I 0:0E40 CHAN: ccw 14300001 00000000
HHC01312I 0:0E40 CHAN: stat 0C00, count 0001
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 00804007, stat 0C00, count 0001, ccw 1F50E8E8
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:0 LPM:80 Flags:08000 ....F....... ........ CCW:1F50E8F0
HHC01315I 0:0E40 CHAN: ccw E4200100 1F50E8F8
HHC01312I 0:0E40 CHAN: stat 0C00, count 00F9=>FF308860 30880100 00000000 00800100 ..h-.h..........
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 00804007, stat 0C00, count 00F9, ccw 1F50E8F8
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:0 LPM:80 Flags:08000 ....F....... ........ CCW:1F503DA0
HHC01315I 0:0E40 CHAN: ccw E4200100 1F503728
HHC01312I 0:0E40 CHAN: stat 0C00, count 00F9=>FF308860 30880100 00000000 00800100 ..h-.h..........
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 00804007, stat 0C00, count 00F9, ccw 1F503DA8
HHC01318I 0:0E40 CHAN: test I/O: cc=0
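
The `ccw` lines in the trace above can be picked apart with a few lines of Python. This is only a reading aid, assuming the format-1 CCW layout (command code, flags, 16-bit count, 31-bit address). The helper name `parse_ccw` and the short flag names are made up for this sketch and are not part of Hercules.

```python
from dataclasses import dataclass

# Flag bits in byte 1 of a format-1 CCW.
CCW_FLAGS = {
    0x80: "CD",    # chain data
    0x40: "CC",    # chain command
    0x20: "SLI",   # suppress incorrect-length indication
    0x10: "SKIP",  # skip data transfer
    0x08: "PCI",   # program-controlled interruption
    0x04: "IDA",   # indirect data addressing
    0x02: "S",     # suspend
}


@dataclass
class Ccw:
    opcode: int
    flags: list
    count: int
    address: int


def parse_ccw(trace_words: str) -> Ccw:
    """Parse the two hex words printed after 'ccw' in an HHC01315I line."""
    raw = bytes.fromhex("".join(trace_words.split()))
    if len(raw) != 8:
        raise ValueError("a CCW is exactly 8 bytes")
    flags = [name for bit, name in CCW_FLAGS.items() if raw[1] & bit]
    return Ccw(
        opcode=raw[0],
        flags=flags,
        count=int.from_bytes(raw[2:4], "big"),
        address=int.from_bytes(raw[4:8], "big"),
    )


if __name__ == "__main__":
    for words in ("14300001 00000000", "E4200100 1F50E8F8"):
        ccw = parse_ccw(words)
        print(f"opcode={ccw.opcode:#04x} flags={ccw.flags} "
              f"count={ccw.count} addr={ccw.address:#010x}")
```

Read this way, `E4200100 1F50E8F8` is command code 0xE4 (Sense ID) with a 256-byte buffer and the SLI flag set.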

The activation of the XCAE40E resource yields the following output:

HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:0 LPM:80 Flags:08000 ....F....... ........ CCW:0AAF27F8
HHC01315I 0:0E40 CHAN: ccw 03300001 0AAF27F8
HHC01312I 0:0E40 CHAN: stat 0C00, count 0001
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 00804007, stat 0C00, count 0001, ccw 0AAF2800
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC01332I 0:0E40 CHAN: halt subchannel
HHC01300I 0:0E40 CHAN: halt subchannel: cc=0
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 00802001, stat 0C00, count 0001, ccw 0AAF2800
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:6 LPM:80 Flags:08000 ....F....... ........ CCW:0B702358
HHC01315I 0:0E40 CHAN: ccw C3200001 00000000
HHC01312I 0:0E40 CHAN: stat 0C00, count 0000
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 60804007, stat 0C00, count 0000, ccw 0B702360
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:6 LPM:80 Flags:08000 ....F....... ........ CCW:0B702360
HHC01315I 0:0E40 CHAN: ccw E42000FF 0B702398
HHC01312I 0:0E40 CHAN: stat 0C00, count 00F8=>FF308860 308801E3 D3E2C3F6 C640F9F9 ..h-.h.TLSC6F 99
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 60804007, stat 0C00, count 00F8, ccw 0B702368
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:6 LPM:80 Flags:08000 ....F....... ........ CCW:0B702368
HHC01315I 0:0E40 CHAN: ccw 01200018 0B702498=>00160000 41000000 00000000 00000000 ................
HHC00981D 0:0E40 LCS: Accept data of size 24 bytes from guest
HHC00979D LCS: data: +0000< 00160000 41000000 00000000 00000000  ....A........... ................
HHC00979D LCS: data: +0010< 00000000 00000000                    ........         ........        
HHC00922D 0:0E40 CTC: lcs command packet received
HHC00979D LCS: command: +0000< 00160000 41000000 00000000 00000000  ....A........... ................
HHC00979D LCS: command: +0010< 00000000 0000                        ......           ......          
HHC00933D 0:0E40 CTC: executing command start lan (sna)
HHC02499I Hercules utility hercifc - Hercules Network Interface Configuration Program - version 4.3.0.0-SDL
HHC01414I (C) Copyright 1999-2020 by Roger Bowler, Jan Jaeger, and others
HHC01417I ** The SoftDevLabs version of Hercules **
HHC01415I Build date: Jan 26 2021 at 21:04:13
HHC00978D CTC: lcs device port 00: STILL trying to enqueue REPLY frame to device 0E40 00000000 0.0.0.0
HHC00978D CTC: lcs device port 00: STILL trying to enqueue REPLY frame to device 0E40 00000000 0.0.0.0
[Repeated endlessly]
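
For readers following along, here is a minimal sketch of how the 24-byte write in the trace above can be taken apart. It is based on what the trace itself shows: the command dump is 22 bytes long, which matches the leading halfword 0x0016, and byte 4 carries the command code 0x41. Treating the first halfword as a frame length/offset is an inference from that trace, not a statement about the actual structures in ctc_lcs.c. The name used for 0x42 is also an assumption, chosen to mirror the "start lan (sna)" message.

```python
# Command codes discussed in this post. The label for 0x42 is assumed.
LCS_SNA_COMMANDS = {
    0x41: "start lan (sna)",
    0x42: "stop lan (sna)",
}


def decode_lcs_command(hexdump: str) -> dict:
    """Decode the leading bytes of an LCS command buffer given as hex text."""
    raw = bytes.fromhex("".join(hexdump.split()))
    if len(raw) < 5:
        raise ValueError("buffer too short to hold a command code")
    offset = int.from_bytes(raw[0:2], "big")  # inferred: length of this frame
    code = raw[4]
    return {
        "buffer_len": len(raw),
        "frame_len": offset,
        "code": code,
        "name": LCS_SNA_COMMANDS.get(code, "other"),
    }


if __name__ == "__main__":
    buf = ("00160000 41000000 00000000 00000000 "
           "00000000 00000000")
    print(decode_lcs_command(buf))
```

Any code not in the table falls through to "other", which is how the unpatched driver reported 0x41.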

After three minutes, a timeout value was reached for minor nodes of XCAE40E:

STC00004 IST380I ERROR FOR ID = XGEL00 - REQUEST: ACTLINK,SENSE: 081C0000

This console message is accompanied by this debug output of Hercules:

HHC01332I 0:0E40 CHAN: halt subchannel
HHC01300I 0:0E40 CHAN: halt subchannel: cc=0
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 60806001, stat 0000, count 0000, ccw 0B702370
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC00978D CTC: lcs device port 00: STILL trying to enqueue REPLY frame to device 0E40 00000000 0.0.0.0
HHC00978D CTC: lcs device port 00: STILL trying to enqueue REPLY frame to device 0E40 00000000 0.0.0.0
[Repeated endlessly]

The inactivation of the resource yields no more channel communication to the LCS.

Not a single frame was ever visible in a tcpdump running on the associated tap interface.

Forcing an AS/400 to send XID frames to the MAC of the LCS unsurprisingly yielded no reaction, neither from Hercules Tracing, nor from VTAM.

Summary

PoC-dev commented 3 years ago

Hello Vincent,

My end goal is to have a Cisco and IBM SNA network going between a client node and my AS/400 and Mainframe systems, and use Communication Tools MVS/VM bridge on the AS/400 to setup NJE connections over SNA because that's how the manual explains the connection process.

This matches exactly my goal, and thus I'm highly interested in getting in touch with you! My AS/400 model 150 runs V4R5 (no license keys needed in this particular combination). Since Hercules doesn't support 802.2 DLC, and EE was first available on the iSeries platform with V5R3 or V5R4, there's no direct way to connect Hercules and my 150.

What exactly is the Cisco (Router?) supposed to do? I know some images (or licenses on newer hardware) provide a SNAsw branch extender node support. Depending on the scenario being built, this is a serious limitation of using NNs on both sides of the logical SNAsw ports (EE, and 802.2 DLC). Currently I'm using this feature to tinker with VTAM and OS/400. Details on Request. You can find my mail address on my Github profile page.

:wq! PoC

vinatron commented 3 years ago

Nice work!

I sent an email with my system details. If you don't get it, my AS/400 system is a 9406-270 running V5R4 and my Cisco Router is a 2811 with a SNAsw image loaded on it.

I plan on making links to other systems on the internet and want to use the Cisco router to handle the SNA traffic coming in over DMVPN links. If you want to join when the network is rebuilt, let me know.

mcisho commented 3 years ago

Hello Patrik,

A couple of questions. Firstly, are you in the process of doing the changes for SNA on LCS? Secondly, how did you manage to make VTAM use the LCS?

As a result of your interest I've been looking at the config code for LCS, with the intention of firstly allowing SNA LCS's to be defined without using the OAT, and secondly to allow the OAT to specify preconfigured tap interface names. I made the changes to LCS to support preconfigured interfaces ages ago, but decided to leave the OAT until there appeared to be some demand.

I'm attempting to reproduce the results you have achieved. I have an XCA (I copied the one you published in one of the H390-MVS messages) which activated successfully, but does nothing with the LCS. Do I need other VTAM definitions, e.g. an SWNET? My understanding is that an XCA simply defines the hardware to be used for a connection, but the hardware isn't used until a connection is started. Is that correct? I know next to nothing about VTAM resource definition, so any help would be greatly appreciated.

Ian

PoC-dev commented 3 years ago

Hello Ian,

Thanks for participating!

A couple of questions. Firstly, are you in the process of doing the changes for SNA on LCS?

I'd love to do, if I only knew what needs to be changed. :-)

As a result of your interest I've been looking at the config code for LCS, with the intention of firstly allowing SNA LCS's to be defined without using the OAT, and secondly to allow the OAT to specify preconfigured tap interface names. I made the changes to LCS to support preconfigured interfaces ages ago, but decided to leave the OAT until there appeared to be some demand.

Thanks a lot!

Do I understand right that a preconfigured interface includes the need to provide a MAC address for the interface by facilities of the hosting OS, because one can't set the MAC via OAT anymore?

And, does your change permit a possible mixed mode between SNA and IP via LCS?

I'm attempting to reproduce the results you have achieved. I have an XCA (I copied the one you published in one of the H390-MVS messages) which activated successfully, but does nothing with the LCS.

This is the sole, and full member I'm using for my tests. I've only commented out the minor nodes 01..06.

*/* ------------------------------------------------------------------
*/*
*/* 3172 (LSAPLAN) ETHERNET ADAPTER1 (E40)
*/*
*/* ------------------------------------------------------------------
*/*
XCAE40E  VBUILD TYPE=XCA
*/*
XPE40E   PORT  CUADDR=E40,ADAPNO=1,SAPADDR=4,MEDIUM=CSMACD,            -
               DELAY=0,TIMER=30
*/*
XGE40E   GROUP DIAL=YES,CALL=IN,ANSWER=ON,ISTATUS=ACTIVE,              -
               DYNPU=YES
*/*
*/* ----------------------------------------- XCAE40E PERIPHERAL NODES
*/*
XGEL00    LINE
XGEP00      PU
*/*
*XGEL01    LINE
*XGEP01      PU
*/*
*XGEL02    LINE
*XGEP02      PU
*/*
*XGEL03    LINE
*XGEP03      PU
*/*
*XGEL04    LINE
*XGEP04      PU
*/*
*XGEL05    LINE
*XGEP05      PU
*/*
*XGEL06    LINE
*XGEP06      PU

Do I need other VTAM definitions, e.g. an SWNET? My understanding is that an XCA simply defines the hardware to be used for a connection, but the hardware isn't used until a connection is started. Is that correct?

Honestly, I don't know for sure. I've enabled VTAM to be APPN aware by setting NODETYPE=NN,CONNTYPE=APPN in ATCSTR00. This in turn enables incoming connections to be accepted, as I could verify with an EE connection.

To my understanding (with a heavy bias towards APPN), activating the XCAE40E resource must prepare the hardware to accept incoming connections, even if there's no switched major node for an outbound connection.

I know next to nothing about VTAM resource definition, so any help would be greatly appreciated.

Welcome to the club of the clueless. ;-) I have been reading these manuals, at least in parts:

Probably also helpful:

My finding so far is that understanding VTAM needs SNA expertise, and understanding SNA needs VTAM expertise. I have not yet managed to break this tie.

There is an awful lot of general SNA documentation available. Because SNA has come a long way, documentation is plentiful. I don't know yet if it's of any help to read all of it from oldest to newest to get a better understanding of the development that took place over decades.

:wq! PoC

vinatron commented 3 years ago

My first introduction to SNA was Communication Tools for AS/400.

Now I have started to learn about the zSeries VTAM system, so it's a process. Apparently I need to get OSA/SF working to activate my non-QDIO OSE adapters. So far I've gotten SNA working on OS/2, AS/400, Cisco SNAsw, and Windows SNA Server. But yes, having Hercules to work with SNA makes it easier, so I don't have to run my 2 kW z114 system to test the routing.

Keep up the good work, and if you need any information from my system, please let me know. I will try my best to provide all I can.

mcisho commented 3 years ago

Do I understand right that a preconfigured interface includes the need to provide a MAC address for the interface by facilities of the hosting OS, because one can't set the MAC via OAT anymore?

Yes. The following are the commands (from a script) I use to set up a preconfigured tap interface on a Fedora host for use with Hercules:

sudo ip tuntap add mode tap dev tap211
sudo nmcli dev set tap211 managed no
sudo ip link set dev tap211 address c6:d6:c7:0e:22:80
sudo ip -4 addr add dev tap211 192.168.232.104/26
sudo ip link set dev tap211 mtu 1500
sudo ip link set dev tap211 up
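
When picking a MAC address for a preconfigured tap interface by hand, a quick sanity check is that it is a locally administered unicast address, so it cannot collide with a vendor-assigned one. The following is a small illustrative sketch; the function name `mac_flags` is made up here. It only inspects the two low-order bits of the first octet, which is the standard IEEE 802 convention.

```python
def mac_flags(mac: str) -> dict:
    """Report the IEEE 802 U/L and I/G bits of a colon-separated MAC address."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError("expected six colon-separated octets")
    first = int(octets[0], 16)
    return {
        "locally_administered": bool(first & 0x02),  # U/L bit
        "multicast": bool(first & 0x01),             # I/G bit
    }


if __name__ == "__main__":
    # The tap211 address above and the HWADD value from the OAT example.
    for mac in ("c6:d6:c7:0e:22:80", "02:00:FE:DF:00:42"):
        print(mac, mac_flags(mac))
```

Both addresses used in this thread come out as locally administered and unicast.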

And, does your change permit a possible mixed mode between SNA and IP via LCS?

Yes and no. If you want to use a single tap interface for both IP and SNA via LCS, you would have to use the OAT to define the IP and SNA devices, all with the same port number, just like today. However, if you specify a single device LCS for SNA, it would use a tap interface that was exclusively for its use, just like a two device LCS for IP does today.

This is the sole, and full member I'm using for my tests. I've only commented out the minor nodes 01..06. ... NODETYPE=NN,CONNTYPE=APPN ATCSTR00.

Thanks, I'll try later.

Ian

PoC-dev commented 3 years ago

Hello Ian,

Yes.

Okay, thanks for clarification.

Yes and no. If you want to use a single tap interface for both IP and SNA via LCS, you would have to use the OAT to define the IP and SNA devices, all with the same port number, just like today.

I see. That's no big deal for me.

However, if you specify a single device LCS for SNA, it would use a tap interface that was exclusively for its use, just like a two device LCS for IP does today.

Understood: Because IP needs two devices.

Thanks, I'll try later.

I'm looking forward to what you've found!

:wq! PoC

PoC-dev commented 3 years ago

Hello Vincent,

My first introduction to SNA was Communication Tools for AS/400 now I have started to learn about the zSeries VTAM system so it's a process

Good luck! See my answer to Ian above: VTAM is a huge beast to tame. I gave up on doing this as a side task running at low priority. It's just too much, and I need to dedicate a lot more energy to it.

I need to get OSA/SF working to activate my non-QDIO OSE adapters apparently.

I can't help you with that, unfortunately. But see my list of IBM documents above. There is a lot of helpful content regarding VTAM configuration.

But so far I've gotten SNA on OS/2 AS/400 Cisco SNAsw and Windows SNA server.

Congrats!

But yes it makes it easier when you have Hercules to work with SNA so I don't have to run my 2KW z114 system to test the routing.

Indeed! And I bet, the time needed between "start Hercules" and "system has finished all IPL related work" is a lot less than "switching on power" and "system has finished all IPL related work" on your z114. Even though the z114 is a lot more cool, in a 2 kW heating way. ;-)

Keep up the good work and if you need any information from my system please let me know I will try my best to provide all I can.

Thanks! Unfortunately, I can't see how a real z114 with an OSE might help. But then, Ian wrote that an OSE isn't too different to an LCS. So maybe this could be a way. Ian, your comment?

:wq! PoC

vinatron commented 3 years ago

Yes. I am just saying if you need any packet captures or anything of that nature.

Also, while reading an OSA/SF manual, I learned that LCS is apparently the name for the TCP/IP operating mode and LSA is the type for the SNA operating mode, which I found interesting. Both use the same OSE controller base type.

I'm working on transferring VTAM from VM/ESA to zVM and Dumping OS/390 for MVS VTAM to work with. I have VSE VTAM up already, but just haven't started defining SNA links yet.

mcisho commented 3 years ago

Hi Patrik, thanks, your definition works for me too. Ian

vinatron commented 3 years ago

I can also put my machine in service mode and collect service data about the operation of the adapters, if that can be of some use. I've found out a little bit on how to do operations like this poking around in the SE menus. They definitely take a while to come up. Faster than the OS/2 SE machines, but the Primary takes significantly longer than the alternate. I assume that's because the primary is spinning up more services.

PoC-dev commented 3 years ago

Hello Vincent,

I can also put my machine in service mode and collect service data about the operation of the adapters, if that can be of some use. I've found out a little bit on how to do operations like this poking around in the SE menus.

Thank you for your kind offer. Unfortunately, I'm not skilled enough in mainframe terms to know whether, and how, what you can collect could help in enhancing the LCS driver. Hopefully the other devs can shed some light?

:wq! PoC

vinatron commented 3 years ago

Possibly. I'm open to anything that may help in some way, shape, or form.

PoC-dev commented 3 years ago

Hello,

In case someone missed it: Yesterday I reworked my findings post.

:wq! PoC

Fish-Git commented 3 years ago

In case someone missed it: Yesterday I reworked my findings post.

Just out of curiosity, what "findings post" would that be?

PoC-dev commented 3 years ago

Abstract

These are the collected findings as Documented in the groups.io Mailing List H390-MVS at the end of January 2021 regarding Implementation of SNA support in Hercules-Hyperion to use 802.2 DLC on Ethernet via TAP device. They have been updated on Jan 30, 2021.

The consensus among all devs so far was: There is no SNA support in Hercules. This wasn't sufficient for me.

Documentation for the LCS device hinted at partial SNA support, because the OAT file might contain a SNA directive. Looking at the ctc_lcs.c file revealed that a basic handling of SNA frames is already implemented. It is not known to which extent. This was my motivation to clarify this ambiguity.

Ian states:

QETH is OSD only. SNA would use an OSE(?), which is pretty much identical to LCS.

So I didn't bother to even look at QETH.

Testbed configuration includes

Thanks to (in no particular Order) Fish, Harold Grovesteen, and Ian Shorter for helping to shed some light on that issue.

Hercules configuration details

Hercules has been configured with a stock LCS device. Only one, because according to Ian, only TCP/IP support needs two devices: One for reading from and one to write into. E40 is documented as AWS3172, VTAM 3172 emulation in the original P390 configuration file, so I assume the HCD configuration of the guest OS is correct for that type of device also.

0E40 LCS --oat hercules.oat --debug

The OAT file provides a means to route packets to the available ports of the virtual 3172.

*********************************************************
* Dev   Mode  Port  Entry specific information          *
*********************************************************
  0E40  SNA   00
  HWADD 00 02:00:FE:DF:00:42
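
To make the structure of this small OAT file explicit, here is a rough parsing sketch. It only understands the two kinds of line shown above (a device entry and an HWADD entry) plus `*` comment lines. It is not how Hercules itself reads the file, and it ignores every other OAT statement. Reading the port number as decimal is an assumption.

```python
def parse_oat(text: str) -> dict:
    """Parse device and HWADD lines from an OAT file like the one above."""
    devices = []
    hwaddrs = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("*"):
            continue  # blank line or comment
        if fields[0].upper() == "HWADD":
            # HWADD <port> <mac>
            hwaddrs[int(fields[1])] = fields[2].lower()
        else:
            # <device> <mode> <port> [entry specific information]
            devices.append({
                "device": int(fields[0], 16),
                "mode": fields[1].upper(),
                "port": int(fields[2]),
            })
    return {"devices": devices, "hwaddrs": hwaddrs}


if __name__ == "__main__":
    sample = """\
* Dev   Mode  Port  Entry specific information
  0E40  SNA   00
  HWADD 00 02:00:FE:DF:00:42
"""
    print(parse_oat(sample))
```

For the file above this yields one SNA device (0x0E40) on port 0 and one hardware address bound to port 0.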

After the start of Hercules, a virtual device tap0 is automatically created. I bridge this device to an unused NIC in my test equipment:

ip link add br0 type bridge
ip link set dev eth0 up
ip link set dev eth0 master br0
bridge link set dev eth0 state 3
ip link set dev tap0 up
ip link set dev tap0 master br0
bridge link set dev tap0 state 3

Maybe the link state settings are not necessary, because they are handled by STP. STP is off by default.

Guest OS

Before trying anything, I applied the change regarding a missing interrupt handler as laid out in the New User's Cookbook accompanying the distribution.

MIH

It required changing the parmlib member IECIOS00:

MIH TIME=00:00,DEV=E40

I verified successfully that it works after the next IPL, with D IOS,MIH,DEV=E40:

0E40=00:00.

Note: I have added the same configuration change to the E20-E21 devices (CTCI), used solely for IP connectivity before I tried to get E40 running. So far, I've not observed adverse effects.

VTAM

VTAM configuration is mainly unchanged from the original file in sys1.local.vtamlist(xcae40e). The comment in that file states: 3172 (LSAPLAN) ETHERNET ADAPTER1 (E40). Definition:

XPE40E PORT CUADDR=E40,ADAPNO=1,SAPADDR=4,MEDIUM=CSMACD,DELAY=0,TIMER=30

Note: In the original member, this statement is split across two lines, using a continuation character.

The OAT documentation mentions ports starting with 0, while the VTAM configuration mentions ADAPNO=1 by default. It is not yet clear if this is an off-by-one glitch, or another error condition/inconsistency. Further testing might be necessary.

Findings

Trying to activate this major node in VTAM immediately raises an error in the guest OS console:

IST1023E START I/O TIMEOUT OCCURRED FOR CUA=0E40

The LCS debug output revealed that VTAM sends a command frame which is not recognized by the command handler:

HHC00933D 0:0E40 CTC: executing command other (0x41)

I extended the code for ctc_lcs.c and the header file defining the commands, so command 0x41 is redirected to a TCP/IP Start LAN command handler, and 0x42 to the accompanying Stop LAN command handler. See attached diffs:

With help from Ian and Fish, I managed to get some debugging output, including a CCW trace (type t+0E40 into console, in addition to the already enabled debug mode of the LCS driver).

This is the output while IPL was progressing. Obviously some guest OS driver probing or even initializing the hardware:

HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:0 LPM:80 Flags:08000 ....F....... ........ CCW:1F50E8E0
HHC01315I 0:0E40 CHAN: ccw 14300001 00000000
HHC01312I 0:0E40 CHAN: stat 0C00, count 0001
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 00804007, stat 0C00, count 0001, ccw 1F50E8E8
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:0 LPM:80 Flags:08000 ....F....... ........ CCW:1F50E8F0
HHC01315I 0:0E40 CHAN: ccw E4200100 1F50E8F8
HHC01312I 0:0E40 CHAN: stat 0C00, count 00F9=>FF308860 30880100 00000000 00800100 ..h-.h..........
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 00804007, stat 0C00, count 00F9, ccw 1F50E8F8
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:0 LPM:80 Flags:08000 ....F....... ........ CCW:1F503DA0
HHC01315I 0:0E40 CHAN: ccw E4200100 1F503728
HHC01312I 0:0E40 CHAN: stat 0C00, count 00F9=>FF308860 30880100 00000000 00800100 ..h-.h..........
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 00804007, stat 0C00, count 00F9, ccw 1F503DA8
HHC01318I 0:0E40 CHAN: test I/O: cc=0
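
The data returned for the E4 (Sense ID) CCWs in the trace above can be decoded with a short sketch, assuming the usual Sense ID layout: 0xFF, control unit type and model, then device type and model. The function name `decode_sense_id` is made up for this example.

```python
def decode_sense_id(hex_words: str) -> dict:
    """Decode the first seven bytes of Sense ID data given as hex text."""
    raw = bytes.fromhex("".join(hex_words.split()))
    if len(raw) < 7:
        raise ValueError("need at least 7 bytes of Sense ID data")
    if raw[0] != 0xFF:
        raise ValueError("Sense ID data is expected to start with 0xFF")
    return {
        "cu_type": raw[1:3].hex().upper(),
        "cu_model": f"{raw[3]:02X}",
        "dev_type": raw[4:6].hex().upper(),
        "dev_model": f"{raw[6]:02X}",
    }


if __name__ == "__main__":
    # First two words of the data shown in the HHC01312I lines above.
    print(decode_sense_id("FF308860 30880100"))
```

Under that assumption the LCS device identifies itself as control unit type 3088 model 60 with device type 3088 model 01.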

The activation of the XCAE40E resource yields the following output:

HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:0 LPM:80 Flags:08000 ....F....... ........ CCW:0AAF27F8
HHC01315I 0:0E40 CHAN: ccw 03300001 0AAF27F8
HHC01312I 0:0E40 CHAN: stat 0C00, count 0001
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 00804007, stat 0C00, count 0001, ccw 0AAF2800
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC01332I 0:0E40 CHAN: halt subchannel
HHC01300I 0:0E40 CHAN: halt subchannel: cc=0
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 00802001, stat 0C00, count 0001, ccw 0AAF2800
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:6 LPM:80 Flags:08000 ....F....... ........ CCW:0B702358
HHC01315I 0:0E40 CHAN: ccw C3200001 00000000
HHC01312I 0:0E40 CHAN: stat 0C00, count 0000
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 60804007, stat 0C00, count 0000, ccw 0B702360
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:6 LPM:80 Flags:08000 ....F....... ........ CCW:0B702360
HHC01315I 0:0E40 CHAN: ccw E42000FF 0B702398
HHC01312I 0:0E40 CHAN: stat 0C00, count 00F8=>FF308860 308801E3 D3E2C3F6 C640F9F9 ..h-.h.TLSC6F 99
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 60804007, stat 0C00, count 00F8, ccw 0B702368
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC01334I 0:0E40 CHAN: ORB: IntP:00F4AF50 Key:6 LPM:80 Flags:08000 ....F....... ........ CCW:0B702368
HHC01315I 0:0E40 CHAN: ccw 01200018 0B702498=>00160000 41000000 00000000 00000000 ................
HHC00981D 0:0E40 LCS: Accept data of size 24 bytes from guest
HHC00979D LCS: data: +0000< 00160000 41000000 00000000 00000000  ....A........... ................
HHC00979D LCS: data: +0010< 00000000 00000000                    ........         ........        
HHC00922D 0:0E40 CTC: lcs command packet received
HHC00979D LCS: command: +0000< 00160000 41000000 00000000 00000000  ....A........... ................
HHC00979D LCS: command: +0010< 00000000 0000                        ......           ......          
HHC00933D 0:0E40 CTC: executing command start lan (sna)
HHC02499I Hercules utility hercifc - Hercules Network Interface Configuration Program - version 4.3.0.0-SDL
HHC01414I (C) Copyright 1999-2020 by Roger Bowler, Jan Jaeger, and others
HHC01417I ** The SoftDevLabs version of Hercules **
HHC01415I Build date: Jan 26 2021 at 21:04:13
HHC00978D CTC: lcs device port 00: STILL trying to enqueue REPLY frame to device 0E40 00000000 0.0.0.0
HHC00978D CTC: lcs device port 00: STILL trying to enqueue REPLY frame to device 0E40 00000000 0.0.0.0
[Repeated endlessly]

After three minutes, a timeout value was reached for minor nodes of XCAE40E:

STC00004 IST380I ERROR FOR ID = XGEL00 - REQUEST: ACTLINK,SENSE: 081C0000
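
As a reading aid for the sense value in the IST380I message, the sketch below splits an SNA sense code into its category byte, modifier byte, and two sense-specific bytes. The category names are the standard SNA ones. As far as I know, 081C is documented as "request not executable", but treat that label as an assumption and check it against the SNA formats manual. The function name is made up.

```python
# SNA sense data categories (byte 0).
SENSE_CATEGORIES = {
    0x08: "request reject",
    0x10: "request error",
    0x20: "state error",
    0x40: "request header usage error",
    0x80: "path error",
}


def split_sense(sense_hex: str) -> dict:
    """Split a 4-byte SNA sense code such as '081C0000' into its parts."""
    raw = bytes.fromhex(sense_hex)
    if len(raw) != 4:
        raise ValueError("an SNA sense code is 4 bytes")
    return {
        "category": SENSE_CATEGORIES.get(raw[0], "unknown"),
        "modifier": raw[1],
        "specific": int.from_bytes(raw[2:4], "big"),
    }


if __name__ == "__main__":
    print(split_sense("081C0000"))
```

Here the category is "request reject" with modifier 0x1C and no sense-specific information.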

This console message is accompanied by this debug output of Hercules:

HHC01332I 0:0E40 CHAN: halt subchannel
HHC01300I 0:0E40 CHAN: halt subchannel: cc=0
HHC00806I Processor CP00: I/O interrupt code 00010022 parm 00F4AF50 id 28000000
HHC01317I 0:0E40 CHAN: scsw 60806001, stat 0000, count 0000, ccw 0B702370
HHC01318I 0:0E40 CHAN: test I/O: cc=0
HHC00978D CTC: lcs device port 00: STILL trying to enqueue REPLY frame to device 0E40 00000000 0.0.0.0
HHC00978D CTC: lcs device port 00: STILL trying to enqueue REPLY frame to device 0E40 00000000 0.0.0.0
[Repeated endlessly]

Inactivating the resource produces no further channel communication with the LCS.

Not a single frame was ever visible in a tcpdump running on the associated tap interface.

Forcing an AS/400 to send XID frames to the MAC of the LCS unsurprisingly yielded no reaction, either from Hercules tracing or from VTAM.

Summary

Fish-Git commented 3 years ago

Abstract

These are the collected findings as Documented in ...

Ah yes! I remember seeing that post! I wonder what happened to it? Did you delete it? In any case, I guess it's not that important anymore, since you've now posted a more current version of it. Thank you for that.

Fish-Git commented 3 years ago
  • OS/390 V2R10 ADCD, being freely available from archive.org.

Question: for documentation purposes and for the benefit of others who might be interested, do you have the direct URL available? Thanks.

Fish-Git commented 3 years ago

Ian (no last name known)

Shorter. His name is Ian Shorter.

wrljet commented 3 years ago

OS/390 V2R10 ADCD, being freely available from archive.org.

PoC-dev commented 3 years ago

Hello Fish,

Ah yes! I remember seeing that post! I wonder what happened to it? Did you delete it?

No, it's just hidden.

In any case, I guess it's not that important anymore, since you've now posted a more current version of it. Thank you for that.

You're welcome!

:wq! PoC

PoC-dev commented 3 years ago

Hello Fish,

Shorter. His name is Ian Shorter.

Thanks. Corrected.

:wq! PoC

PoC-dev commented 3 years ago

Dear Devs,

Does one of you have the expertise to guide Vincent "@vinatron" through enabling a CCW trace on real hardware?

With his OSA card in non-QDIO mode and with tracing active while he's activating an XCA major node in VTAM, seeing the command 41 show up would be a huge step forward: it would indicate that the channel command words are most likely the same for OSA non-QDIO and LCS.

Opinions, anybody?

:wq! PoC

Fish-Git commented 3 years ago

Does one of you have the expertise to guide Vincent "@vinatron" through enabling a CCW trace on real hardware?

That certainly sounds like an excellent idea, Patrick!

Unfortunately I myself have zero experience with modern mainframe hardware. The last mainframes I had hands-on experience with were System/370s and 4300s!

Hopefully someone else who does have hands-on experience will read this and provide some instruction to help us out. Or maybe this question should be asked on the main Hercules list/forum? Maybe someone there might know? (I'm not sure how many members read GitHub Issues so we might have better luck asking the question to a wider audience.)

PoC-dev commented 3 years ago

Hello Vincent,

Apparently GTF is what we need to utilize. Do you want to read and learn yourself, or do you require a ready-made action plan? For the latter, I'd also need to read about it. Do not hesitate to ask questions, if they arise.

@Fish-Git: IBM provides a lot of documentation. Most often the question is what to search for.  :-)

:wq! PoC

vinatron commented 3 years ago

Thanks. I can try and define them, but I may need some guidance, so a plan of procedure would probably be helpful if there's one available.

Thanks for your assistance with this project. I hopefully plan to have OSA/SF set up so I can configure my non-QDIO adapters.

Keep up the good work!

Kind regards, Vincent

PoC-dev commented 3 years ago

Hello Vincent,

Thanks. I can try and define them, but I may need some guidance, so a plan of procedure would probably be helpful if there's one available.

Two ideas:

Thanks for your assistance with this project. I hopefully plan to have OSA/SF set up so I can configure my non-QDIO adapters.

Thanks. Unfortunately, I can't help you with that yet. VTAM isn't exactly something I'm fluent in.  :-)

Keep up the good work!

Thanks! Same to you!

:wq! PoC

vinatron commented 3 years ago

VTAM isn't exactly something I'm fluent in.

Me either. OSA/SF is for enabling the links in either TCPIP or SNA operating modes. I'm trying to find a good manual explaining the process of defining them so I can start using them.

wably commented 3 years ago

You do need to use GTF. Unfortunately, I can't recall all of the exact detail because it's been about ten years since I last did any VTAM traces, so I am recalling all of this from memory. But basically, the process is something like this:

 START GTF           (an operator command)

After GTF starts, it will prompt you at the console for the kind of trace you want and what types of records you wish to trace. As best I can recall, VTAM traces needed GTF to record USR trace entries.

After GTF is satisfied, then you start the actual VTAM trace itself, another operator command:

 F NET,TRACE,TYPE=type,ID=node_name

Where 'type' is BUF for a buffer trace of the data stream, IO for an I/O trace, or CNM for a network management trace. For the issue you describe, any of these might be beneficial, though some more than others.

ID=node_name is the name of the VTAM entity you wish to trace, for example the node or controller or line you are trying to activate as described in this GitHub issue above.

Once the F NET trace is started, then you would issue VARY NET to activate your node. The trace data should be recorded to GTF. Once you are satisfied that you have recreated your situation, stop the traces:

 F NET,NOTRACE,TYPE=buf/io/cnm,ID=node_name
 P GTF

The trace data should be in your GTF trace dataset. In order to get it out of there, you need to format it. It used to be done with a service aid utility like AMDPRDMP or something like that but I think now you have to use IPCS to view the trace in a formatted style.

Before starting GTF, you'll also need to review the GTF started task procedure to ensure the trace datasets are defined and that you can access them. It's rather an ordeal to get all of this set up for the first time, but once done, it is easy to just issue the commands to start and stop the traces as needed and review the results, repeatedly.

In order to find the data you are looking for, you'll need a manual which I believe was called SNA Formats. It should have the layout and description of the data streams that VTAM will send and receive from your node. I believe that they were called PIUs.

I wish I could be more specific. I used to keep notes on all of this stuff so I could use it when needed but after I retired I don't have access to these notes any longer. There is also a manual called z/OS MVS Diagnosis Tools and Service Aids. In there is a whole chapter on GTF and how to set it up, answer the prompts, and how to view the trace. Also, for the format of the VTAM F NET,TRACE command, look in a manual called VTAM Operation. (*)

While none of this immediately gets you going, hopefully there is enough here to get you pointed in the right direction where you can start tracking down how to do it and get what you need.

Regards, Bob


(*) EDITOR'S NOTE:

I was unable to locate any "VTAM Operation" manual, but I did find the following two manuals which may be helpful?

I took a quick peek at the "SNA Operation" manual and the entire(?) manual appears to be focused on VTAM operation, so maybe that's the manual you were thinking of?

Fish

wably commented 3 years ago

Yes, SNA Operation is the right one. It used to be called VTAM Operation in the older ACF/VTAM days, before the name was changed to CS (Communications Server). In any case, this manual is the one that has all of the VTAM operator commands, so that you can decide what trace parameters you wish to use.

Regards, Bob

Fish-Git commented 3 years ago

FYI:   Links have been updated.

Fish-Git commented 3 years ago

FYI:   Link for the "SNA Formats" manual has been reverted to its previous URL.

(Apparently ALL of the links on IBM's "Publication Information" web site are broken! SIGH!!)

But since I happen to have a copy of the -20 version of the manual (I must have downloaded it before IBM broke their Publications web site), I am attaching it here for your convenience:

Hope that helps!

(and sorry for the sudden flurry of posts and for the SNA Formats URL snafu)

mcisho commented 3 years ago

An update to report some, very small, progress. I've revamped the LCS code and VTAM reads the reply to the SNA Start LAN command. However that's all that apparently happens, other than VTAM reporting an I/O timeout. I'm wondering if the timeout message is the result of the reply we provide, perhaps because the reply doesn't contain some expected information, which causes VTAM to ignore the reply. Unfortunately, as we don't know what VTAM might or might not be expecting, that's just supposition.

PoC-dev commented 3 years ago

Hello Ian,

An update to report some, very small, progress.

Thanks for the revamp and insight!

I'm wondering if the timeout message is the result of the reply we provide, perhaps because the reply doesn't contain some expected information, which causes VTAM to ignore the reply. Unfortunately, as we don't know what VTAM might or might not be expecting, that's just supposition.

Unfortunately we can't tell unless Vincent collects I/O trace data on his real hardware and my assumption (that IBM did not reinvent the wheel but reused the existing LCS SNA channel commands for OSA) proves true.

:wq! PoC

PoC-dev commented 3 years ago

Hello Vincent,

Me either. OSA/SF is for enabling the links in either TCPIP or SNA operating modes. I'm trying to find a good manual explaining the process of defining them so I can start using them.

Any progress here? I don't want to push; I'm just curious.

:wq! PoC

vinatron commented 3 years ago

Haven't figured out OSA/SF yet. However, I did manage to set up internal routing via CTC devices. Unfortunately, I've been having some issues with my ESA1PK volume that I need to fix first. Things will move along once that's fixed.

mcisho commented 3 years ago

An update to report some progress. I've worked out what input VTAM is checking for, and have provided some acceptable values (though I don't know what some of the values are!) and VTAM is now happy with the replies to the SNA Start LAN command, and the SNA LAN Statistics command.

However, after that things go downhill!

The following two snippets are from the OS/390 console and the Hercules log:

18.48.00           v net,act,id=xcae43e
18.48.00 STC00108  IST097I VARY ACCEPTED
18.48.01 STC00108  IST093I XCAE43E ACTIVE
18.48.02 STC00108  IOS000I 0E43,**,CPC,17,0020,,,,VTAM
18.48.02 STC00108  IST380I ERROR FOR ID = XGEL00 - REQUEST: ACTLINK,  SENSE: 081C0000
20:48:02 HHC01334I 0:1D56 CHAN: ORB: IntP:00AAC3B0 Key:6 LPM:80 Flags:0C300 ....FP....HT ........ CCW:7CAD0998
20:48:02 HHC01315I 0:1D56 CHAN: ccw 17600001 8FFFFFF1
20:48:02 HHC01312I 0:1D56 CHAN: stat 0020, count 0000
20:48:02 HHC00806I Processor CP00: I/O interrupt code 00010037 parm 00AAC3B0 id 00000000
20:48:02 HHC01317I 0:1D56 CHAN: scsw 60C04017, stat 0020, count 0000, ccw 7CAD09A0

I'm running in a virtual machine, and VM's channel program translation causes the I/O error. I believe I need to declare the Control CCW as an immediate command, which should avoid this error. However, after that comes the hard part, because I don't have anything that does SNA.

Does anyone have any suggestions on how to set up a testing environment?   (Note: I use Fedora Linux.)

PoC-dev commented 3 years ago

Hello,

An update to report some progress. I've worked out what input VTAM is checking for, and have provided some acceptable values (though I don't know what some of the values are!) and VTAM is now happy with the replies to the SNA Start LAN command, and the SNA LAN Statistics command.

That's great news!

Does anyone have any suggestions on how to set up a testing environment?   (Note: I use Fedora Linux.)

I see multiple approaches.

While I have the machinery to actually provide a testbed for the Cisco and AS/400 approaches (including remote access for testers, if desired), I still lack the fundamental understanding of how to properly configure VTAM to create a fully working APPN+APPC connection with accompanying SNASVCMG sessions for name resolution. But maybe this isn't needed for initial testing?

If you can give me directions on how I can access the latest Hyperion with your changes already applied, I'm happy to test and provide status and LCS debug output. Possibly more if I know what exactly you need. (I'm still not used to Git, so a little help on how to get the sources locally would be highly appreciated. I still don't have a solid grip on handling code tree branches.)

mcisho commented 3 years ago

But maybe this isn't needed for initial testing?

I suspect there is a long way to go before name resolution becomes a problem!

If you can give me directions on how I can access the latest Hyperion with your changes ...

You can clone my work in progress from https://github.com/mcisho/hyperion-lcs-sna.git

Possibly more if I know what exactly you need.

At present I'm just trying to understand what the messages coming from VTAM mean, and how to fool VTAM into being happy with the responses. The latest message I'm trying to understand, and which currently has me completely baffled, is:-

HHC00979D LCS: data: +0000< 00160000 00000000 00140400 000C0C99  ................ ...............r
HHC00979D LCS: data: +0010< 0003C000 00000000 01000000 0000      ..............   ..{...........  

I hope a real LCS would understand!

What Host system do you use?

PoC-dev commented 3 years ago

I suspect there is a long way to go before name resolution becomes a problem!

Allow me to rephrase. Because of my lack of understanding about VTAM, I can't tell if a problem is because of my own inability, or Hercules.

You can clone my work in progress from:   https://github.com/mcisho/hyperion-lcs-sna.git

Thanks. I'll have a look over the Easter holidays and get back to you.

At present I'm just trying to understand what the messages coming from VTAM mean, and how to fool VTAM into being happy with the responses.

I'm afraid I can't help with that. Guessing a protocol is very hard.

What Host system do you use?

Current Debian Linux. Since the tun/tap support is in the Kernel, it shouldn't matter which distro I use.

vinatron commented 3 years ago

One more note: a newer AS/400 can speak SNA if you have "Communication Tools" package installed. Newer machines can't do Dialup or SDLC though. They can however still do SNA APPC/APPN over Ethernet.

I have 2 ways to connect my AS/400: either with one of my Ethernet interfaces, or by connecting my Modem interface to my Cisco 2811 and have them form a PPP connection for SNASw.

I finally have an ESCON adapter now too, so I can run some backups of my system and hopefully proceed.

I still haven't figured out OSA/SF. I have however developed an alternative method in case I can't get OSA/SF working.

PoC-dev commented 3 years ago

I feel I should clarify some statements.

One more note: a newer AS/400 can speak SNA if you have "Communication Tools" package installed.

Which is most often the case, because AFAIK it's a no-charge feature.

Newer machines can't do Dialup or SDLC though.

Because they lack proper hardware support. I'm not sure if support was removed in the OS, but if you add an IOP card (dedicated I/O Processor on a PCI card, claiming PCI slots nearby) and sync serial cards, I guess it's still working.

They can however still do SNA APPC/APPN over Ethernet.

Only on IOAs (I/O Adapters) with max 100M (Ethernet) and an IOP serving that particular adapter. Gigabit Ethernet is an IOPless feature (the CPU effectively shovels the data). With that your only option is to use Enterprise Extender (SNA over UDP).

I have 2 ways to connect my AS/400: either with one of my Ethernet interfaces, or by connecting my Modem interface to my Cisco 2811 and have them form a PPP connection for SNASw.

Does SNA run over PPP? I remember having found an RFC for that, but normally serial connections run SDLC with SNA on top.

I still haven't figured out OSA/SF. I have however developed an alternative method in case I can't get OSA/SF working.

Good luck!

PoC-dev commented 3 years ago

You can clone my work in progress from:   https://github.com/mcisho/hyperion-lcs-sna.git

Worked, running right now.

At present I'm just trying to understand what the messages coming from VTAM mean, and how to fool VTAM into being happy with the responses. The latest message I'm trying to understand, and which currently has me completely baffled, is:-

HHC00979D LCS: data: +0000< 00160000 00000000 00140400 000C0C99  ................ ...............r
HHC00979D LCS: data: +0010< 0003C000 00000000 01000000 0000      ..............   ..{...........  

I hope a real LCS would understand!

I don't understand your dump, because I'm not really fluent with such low-level output.

Test setup

Activating XCAE40E

When I activate the XCAE40E, debug-output shows:

16:03:06 HHC00933D 0:0E40 CTC: executing command start lan sna
16:03:06 HHC00923D 0:0E40 CTC: lcs command reply enqueue
16:03:06 HHC00979D LCS: reply: +0000> 00000000 41800000 00000000 50000000  ....A.......P... ............&...
16:03:06 HHC00979D LCS: reply: +0010> 00000000 00000800                    ........         ........
16:03:06 HHC00979D LCS: LCSATTN in: +0000  00000000 00000000 A022C5A4 1E560000  ........."...V.. ..........Eu....
16:03:06 HHC00979D LCS: LCSATTN out: +0000  00000000 00000000 A022C5A4 1E560000  ........."...V.. ..........Eu....
16:03:06 HHC03991D 0:0E40 LCS: device_attention rc=1  0
16:03:06 HHC03991D 0:0E40 LCS: device_attention rc=1  1
16:03:06 HHC03993D 0:0E40 LCS: Status 0C: Residual 00000000: More 00
16:03:06 HHC03991D 0:0E40 LCS: device_attention rc=1  2
16:03:06 HHC03991D 0:0E40 LCS: device_attention rc=1  3
16:03:06 HHC03991D 0:0E40 LCS: device_attention rc=0  4
16:03:06 HHC03992D 0:0E40 LCS: Code 02: Flags 20: Count 000000FF: Chained 00: PrevCode 00: CCWseq 0
16:03:06 HHC00982D 0:0E40 LCS: Present data of size 26 bytes to guest
16:03:06 HHC00979D LCS: data: +0000> 00180000 41800000 00000000 50000000  ....A.......P... ............&...
16:03:06 HHC00979D LCS: data: +0010> 00000000 00000800 0000               ..........       ..........
16:03:06 HHC03993D 0:0E40 LCS: Status 0C: Residual 000000E5: More 00
16:03:06 HHC03992D 0:0E40 LCS: Code 01: Flags 20: Count 00000016: Chained 00: PrevCode 00: CCWseq 0
16:03:06 HHC00981D 0:0E40 LCS: Accept data of size 22 bytes from guest
16:03:06 HHC00979D LCS: data: +0000< 00140000 44000001 00000000 00000000  ....D........... ................
16:03:06 HHC00979D LCS: data: +0010< 00000000 0000                        ......           ......
16:03:06 HHC00922D 0:0E40 CTC: lcs command packet received
16:03:06 HHC00979D LCS: command: +0000< 00140000 44000001 00000000 00000000  ....D........... ................
16:03:06 HHC00979D LCS: command: +0010< 00000000                             ....             ....
16:03:06 HHC00933D 0:0E40 CTC: executing command lan statistics sna
16:03:06 HHC00942I CTC: lcs device tap0 using mac 02:36:CE:CA:DE:09
16:03:06 HHC00923D 0:0E40 CTC: lcs command reply enqueue
16:03:06 HHC00979D LCS: reply: +0000> 00000000 44800001 00000000 01040000  ....D........... ................
16:03:06 HHC00979D LCS: reply: +0010> 00000602 36CECADE 0A00               ....6.....       ..........
16:03:06 HHC00979D LCS: LCSATTN in: +0000  00000000 00000000 A022C5A4 1E560000  ........."...V.. ..........Eu....
16:03:06 HHC03993D 0:0E40 LCS: Status 0C: Residual 00000000: More 00
16:03:06 HHC00979D LCS: LCSATTN out: +0000  00000000 00000000 A022C5A4 1E560000  ........."...V.. ..........Eu....
16:03:06 HHC03991D 0:0E40 LCS: device_attention rc=1  0
16:03:06 HHC03991D 0:0E40 LCS: device_attention rc=1  1
16:03:06 HHC03991D 0:0E40 LCS: device_attention rc=1  2
16:03:06 HHC03991D 0:0E40 LCS: device_attention rc=0  3
16:03:06 HHC03992D 0:0E40 LCS: Code 02: Flags 20: Count 000000FF: Chained 00: PrevCode 00: CCWseq 0
16:03:06 HHC00982D 0:0E40 LCS: Present data of size 28 bytes to guest
16:03:06 HHC00979D LCS: data: +0000> 001A0000 44800001 00000000 01040000  ....D........... ................
16:03:06 HHC00979D LCS: data: +0010> 00000602 36CECADE 0A000000           ....6.......     ............
16:03:06 HHC03993D 0:0E40 LCS: Status 0C: Residual 000000E3: More 00

In theory, this should match what you see when you activate your "line".
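
Reading the dumps above, the first halfword of each LCS frame appears to be two less than the size Hercules reports (0x0014 for 22 bytes, 0x0018 for 26 bytes), and the fifth byte carries the command code that Hercules names in its "executing command" messages. The following Python sketch decodes just those two fields. The field positions and the names describe_lcs_frame and COMMAND_NAMES are assumptions inferred from this log only, not taken from LCS documentation or from the Hercules source.

```python
# Command codes as named by Hercules in the log above.
# Any other code is simply reported as "unknown" by this sketch.
COMMAND_NAMES = {0x41: "start lan sna", 0x44: "lan statistics sna"}


def describe_lcs_frame(data: bytes) -> dict:
    """Decode the two fields that can be inferred from the debug log."""
    if len(data) < 5:
        raise ValueError("need at least 5 bytes to reach the command code")
    code = data[4]  # assumed position of the command code
    return {
        "leading_halfword": int.from_bytes(data[0:2], "big"),
        "size": len(data),
        "command_code": code,
        "command_name": COMMAND_NAMES.get(code, "unknown"),
    }


# "Accept data of size 22 bytes from guest" (16:03:06)
request = bytes.fromhex("00140000 44000001 00000000 00000000 00000000 0000")
# "Present data of size 26 bytes to guest" (16:03:06)
reply = bytes.fromhex("00180000 41800000 00000000 50000000 00000000 00000800 0000")

for frame in (request, reply):
    print(describe_lcs_frame(frame))
```

If the reading is right, the two trailing zero bytes are not counted by the leading halfword, which would explain the constant difference of two.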

VTAM throws an error IST380I ERROR FOR ID = XGEL00 - REQUEST: ACTLINK, SENSE: 081C0038, and the only occurrence of this ID is found here. This hints at a configuration issue within the P390 system or VTAM. Or it could be related to the fact that we're still experimenting and guessing.

Packets from AS/400 on tap0

I can see with tcpdump -i tap0 -n, that packets are arriving:

16:39:36.129395 00:20:35:b5:41:64 > 02:00:fe:df:00:42 SNA Unnumbered, xid, Flags [Poll], length 46
    0x0000:  0404 bf00 0021 0000 0000 0000 0000 0000  .....!..........
    0x0010:  0041 0000 0000 0000 0000 0000 0000 00cc  .A..............
    0x0020:  52d0 0000 002c e0cc 5001 00cc 5314       R....,..P...S.
16:39:43.351687 00:20:35:b5:41:64 > 02:00:fe:df:00:42 SNA Unnumbered, xid, Flags [Poll], length 46
    0x0000:  0404 bf00 0001 0000 0000 0005 0000 0096  ................
    0x0010:  2140 0000 000c 36c8 0000 0000 0000 00c9  !@....6.........
    0x0020:  fb10 0000 002c e0c9 f001 00c9 fb54       .....,.......T
16:39:50.536730 00:20:35:b5:41:64 > 02:00:fe:df:00:42 SNA Unnumbered, xid, Flags [Poll], length 46
    0x0000:  0404 bf00 0000 0000 0000 0096 33a0 0000  ............3...
    0x0010:  005c d3c3 c7c5 0000 0000 0000 0000 0096  .\..............
    0x0020:  3380 0000 002c e096 3001 0096 33c4       3....,..0...3.

At the same time, I see in the Hercules debug output:

16:39:36 HHC00984D 0:0E40 LCS: port 00: Receive frame of size 60 bytes (with unknown packet) from device tap0
16:39:36 HHC00979D LCS: eth frame: +0000> 0200FEDF 00420020 35B54164 00030404  .....B. 5.Ad.... ................
16:39:36 HHC00979D LCS: eth frame: +0010> BF000021 00000000 00000000 00000041  ...!...........A ................
16:39:36 HHC00979D LCS: eth frame: +0020> 00000000 00000000 00000000 00CC52D0  ..............R. ...............}
16:39:36 HHC00979D LCS: eth frame: +0030> 0000002C E0CC5001 00CC5314           ...,..P...S.     ....\.&.....    
16:39:36 HHC00951D CTC: lcs device port 00: no match found, discarding frame
16:39:43 HHC00984D 0:0E40 LCS: port 00: Receive frame of size 60 bytes (with unknown packet) from device tap0
16:39:43 HHC00979D LCS: eth frame: +0000> 0200FEDF 00420020 35B54164 00030404  .....B. 5.Ad.... ................
16:39:43 HHC00979D LCS: eth frame: +0010> BF000001 00000000 00050000 00962140  ..............!@ .............o. 
16:39:43 HHC00979D LCS: eth frame: +0020> 0000000C 36C80000 00000000 00C9FB10  ....6........... .....H.......I..
16:39:43 HHC00979D LCS: eth frame: +0030> 0000002C E0C9F001 00C9FB54           ...,.......T     ....\I0..I..    
16:39:43 HHC00951D CTC: lcs device port 00: no match found, discarding frame
16:39:50 HHC00984D 0:0E40 LCS: port 00: Receive frame of size 60 bytes (with unknown packet) from device tap0
16:39:50 HHC00979D LCS: eth frame: +0000> 0200FEDF 00420020 35B54164 00030404  .....B. 5.Ad.... ................
16:39:50 HHC00979D LCS: eth frame: +0010> BF000000 00000000 009633A0 0000005C  ..........3....\ .........o.....*
16:39:50 HHC00979D LCS: eth frame: +0020> D3C3C7C5 00000000 00000000 00963380  ..............3. LCGE.........o..
16:39:50 HHC00979D LCS: eth frame: +0030> 0000002C E0963001 009633C4           ...,..0...3.     ....\o...o.D    
16:39:50 HHC00951D CTC: lcs device port 00: no match found, discarding frame

So, why does the LCS state "no match"?

HTH!

mcisho commented 3 years ago

Last question first:

16:39:36 HHC00984D 0:0E40 LCS: port 00: Receive frame of size 60 bytes (with unknown packet) from device tap0
16:39:36 HHC00979D LCS: eth frame: +0000> 0200FEDF 00420020 35B54164 00030404  .....B. 5.Ad.... ................
16:39:36 HHC00979D LCS: eth frame: +0010> BF000021 00000000 00000000 00000041  ...!...........A ................
16:39:36 HHC00979D LCS: eth frame: +0020> 00000000 00000000 00000000 00CC52D0  ..............R. ...............}
16:39:36 HHC00979D LCS: eth frame: +0030> 0000002C E0CC5001 00CC5314           ...,..P...S.     ....\.&.....    
16:39:36 HHC00951D CTC: lcs device port 00: no match found, discarding frame

The first six bytes are the destination MAC address, the second six bytes are the source MAC address, the next two bytes are the ethertype field, and the remaining bytes are the payload. To quote Wikipedia: "The EtherType field is two octets long and it can be used for two different purposes. Values of 1500 and below mean that it is used to indicate the size of the payload in octets, while values of 1536 and above indicate that it is used as an EtherType, to indicate which protocol is encapsulated in the payload of the frame. When used as EtherType, the length of the frame is determined by the location of the interpacket gap and valid frame check sequence (FCS)." As you can see, the ethertype value is 0x0003, so it should be indicating the payload length, which it obviously isn't! (The protocol value for SNA ethernet frames is 0x80D5.) Because LCS cannot determine what the frame type is, it discards it. Is the AS/400 actually sending frames with the ethertype equal to 0x0003? Or is it being changed somewhere in the receiving Linux?
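
The header layout and the 1500/1536 rule quoted above can be sketched in a few lines of Python. This is only an illustration using the first frame from the log, not the logic in ctc_lcs.c; the function name parse_ethernet_header is made up for this example.

```python
import struct


def parse_ethernet_header(frame: bytes) -> dict:
    """Split the 14-byte Ethernet header and classify the type/length field."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    dst, src, type_or_len = struct.unpack("!6s6sH", frame[:14])
    if type_or_len <= 1500:
        kind = "length"      # field gives the payload size in octets
    elif type_or_len >= 1536:
        kind = "ethertype"   # field names the encapsulated protocol
    else:
        kind = "undefined"   # 1501..1535 falls between the two ranges
    return {
        "dst": ":".join(f"{b:02x}" for b in dst),
        "src": ":".join(f"{b:02x}" for b in src),
        "type_or_len": type_or_len,
        "kind": kind,
        "payload": frame[14:],
    }


# First "eth frame" dump from the log above (60 bytes received from tap0).
frame = bytes.fromhex(
    "0200FEDF 00420020 35B54164 00030404 "
    "BF000021 00000000 00000000 00000041 "
    "00000000 00000000 00000000 00CC52D0 "
    "0000002C E0CC5001 00CC5314"
)

info = parse_ethernet_header(frame)
print(info["dst"], info["src"], hex(info["type_or_len"]), info["kind"])
print(info["payload"][:3].hex())
```

The last line prints the first three payload bytes, which in an IEEE 802.3 frame would be the 802.2 LLC DSAP, SSAP and control bytes.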

Secondly:-

16:03:06 HHC00942I CTC: lcs device tap0 using mac 02:36:CE:CA:DE:09
16:03:06 HHC00982D 0:0E40 LCS: Present data of size 28 bytes to guest
16:03:06 HHC00979D LCS: data: +0000> 001A0000 44800001 00000000 01040000  ....D........... ................
16:03:06 HHC00979D LCS: data: +0010> 00000602 36CECADE 0A000000           ....6.......     ............

The data presented to the guest contains the MAC address of the guest's view of the tap. LCS issues an ioctl request to the host to obtain the MAC address of the host's view of the tap (which should be the MAC address specified in the Hercules config), adds one to the returned value and passes that value to the guest. However, as you can see, the tap appears to be using a completely different MAC address. Is it the bridge's MAC address? Does the tap get created with the Herc config MAC, which then gets changed as a result of being bridged?
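
As a rough illustration of the "adds one" step, here is a Python sketch that reproduces the values in the log (tap0 reports 02:36:CE:CA:DE:09, the guest is given 02 36CECADE 0A). Treating the MAC as a 48-bit integer with carry is an assumption made for this example; the helper name guest_mac_from_host_mac is made up.

```python
def guest_mac_from_host_mac(host_mac: str) -> str:
    """Return the host MAC plus one, formatted like the Hercules log."""
    value = int(host_mac.replace(":", ""), 16)
    # Assumption: increment the whole 48-bit value and wrap on overflow.
    value = (value + 1) & 0xFFFFFFFFFFFF
    return ":".join(f"{b:02X}" for b in value.to_bytes(6, "big"))


print(guest_mac_from_host_mac("02:36:CE:CA:DE:09"))
```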

Finally, the IP and SNA Codes manual says about sense 081C0038:

If this sense code is issued as the result of the activation of a 3172 XCA major node, verify that the
ADAPNO parameter on the PORT definition statement matches the adapter number assigned by the
IBM 3172 communication controller.

Your Hercules config specifies port (i.e. adapno) zero.

PoC-dev commented 3 years ago

As you can see, the ethertype value is 0x0003, so it should be indicating the payload length, which it obviously isn't! (The protocol value for SNA ethernet frames is 0x80D5.) Because LCS cannot determine what the frame type is, it discards it.

Interesting find!

Is the AS/400 actually sending frames with the ethertype equal to 0x0003?

Apparently. Or there's something else going on.

When creating a line description (configuration object) for an Ethernet IOA (I/O Adapter), I can choose between Ethernet-Version 2, and IEEE 802.3 Standard. Default is both, because ETHv2 is needed for IP on the same IOA.

If the value is both, the system automatically generates Source Service Access Points (SSAPs), and the listed SSAP-Type:

I know no other means to influence packet generation. The AA value hints at SNAP encapsulation, but since it's NONSNA, it's not used for SNA.

However, I can connect to other AS/400s on the LAN, and I can connect from MS SNA Server to the AS/400. Both work.

Or is it being changed somewhere in the receiving Linux?

I highly doubt it, because I see no reason for that to happen. That would be a first for me. But then, I don't have too much very low-level networking experience with Linux.

The data presented to the guest contains the MAC address of the guest's view of the tap. LCS issues an ioctl request to the host to obtain the MAC address of the host's view of the tap (which should be the MAC address specified in the Hercules config), adds one to the returned value and passes that value to the guest. However, as you can see, the tap appears to be using a completely different MAC address.

Yes. I already wondered about this, but I paid no closer attention yet. Since Ethernet-Switches also have their own MAC addresses on the ports, and those Switches are just multiport-bridges, I didn't bother about that.

Is it the bridge's MAC address?

Yes. Apparently, the MAC is inherited from eth0, the NIC solely dedicated to the bridging stuff (for easier debugging). Interestingly, the message above shows a completely different MAC. But see below.

Does the tap get created with the Herc config MAC, which then gets changed as a result of being bridged?

No. The tap is being created with be:2b:9e:43:ae:d6, which is not the MAC from hercules.oat. After adding tap0 to the bridge, the MAC is not changed either.

The LCS is configured in Hercules as follows: 0E40 LCS --oat hercules.oat --debug

From here I learned:

If this option is specified, the optional --mac and guestip entries are ignored in preference to statements in the OAT.

I replaced the MAC address in hercules.oat with the MAC from the bridge: HWADD 00 00:10:18:90:B8:3A

After starting Hercules, I get the following log output:

16:10:37 HHC00955E CTC: invalid MAC 00:10:18:90:B8:3A in statement HWADD in file hercules.oat: HWADD       00    00:10:18:90:B8:3A
16:10:37 HHC00007I Previous message from function 'BuildOAT' at ctc_lcs.c(3550)
16:10:37 HHC01463E 0:0E40 device initialization failed
16:10:37 HHC00007I Previous message from function 'attach_device' at config.c(1331)

This is to be expected. From here I learned:

Note: The MAC address you specify for this option MUST have the 02 locally assigned MAC bit on in the first byte, must NOT have the 01 broadcast bit on in the first byte, and MUST be unique as seen by all other devices on your network segment. It should never, for example, be the same as the host adapter MAC address specified on the -n parameter.
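
The two bit rules in that note are easy to check by hand, or with a small Python sketch like the one below. It is only an illustration of the rule as quoted, not the check performed in ctc_lcs.c, and the function name check_lcs_mac is made up. Uniqueness on the network segment cannot be checked this way.

```python
from typing import List


def check_lcs_mac(mac: str) -> List[str]:
    """Return a list of rule violations for a MAC address string."""
    parts = mac.split(":")
    if len(parts) != 6:
        return ["not six colon-separated octets"]
    first = int(parts[0], 16)
    problems = []
    if not first & 0x02:
        problems.append("locally assigned bit (0x02) is off")
    if first & 0x01:
        problems.append("broadcast bit (0x01) is on")
    return problems


print(check_lcs_mac("00:10:18:90:B8:3A"))  # the bridge MAC that Hercules rejected
print(check_lcs_mac("02:00:FE:DF:00:42"))
```

The first byte of the bridge MAC is 00, so the 02 bit is off, which is consistent with the HHC00955E message above.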

Apparently, the MAC generated by Hercules for tap0 is random and not derived from the OAT. So I set it manually with ifconfig tap0 hw ether 02:00:FE:DF:00:42. After adding tap0 to the bridge and IPLing the OS, the MAC does not change. But once I change tap0's MAC, tcpdump no longer shows traffic from the LAN port to 02:00:FE:DF:00:42 while I vary on the controller description ("PU") for 02:00:FE:DF:00:42 on the AS/400. I do see other traffic from other LAN hosts, though. This is very mysterious and could hint at an in-kernel problem with changing the MAC addresses of tap interfaces.

Now, I also can't set the MAC from within Hercules, because OAT and MAC statements are mutually exclusive, and for SNA, an OAT is mandatory, as I remember from earlier discussions.

Apparently, I've reached a dead end. Any suggestions?

Finally, the IP and SNA Codes manual says about sense 081C0038:-

If this sense code is issued as the result of the activation of a 3172 XCA major node, verify that the
ADAPNO parameter on the PORT definition statement matches the adapter number assigned by the
IBM 3172 communication controller.

Your Hercules config specifies port (i.e. adapno) zero.

Correct, that was a mismatch. I changed this in the VTAM configuration: ADAPNO=0. New sense code after activation: 16.50.18 STC00003 IST380I ERROR FOR ID = XGEL00 - REQUEST: ACTLINK, SENSE: 081C0000. Google did not turn up anything particularly helpful. As I said, I'm not sure whether this is a problem in the LCS code. For VTAM, every PDS member is a major node, and there are PUs defined by default. I can't get my head around why PUs are defined at the line level. Maybe my AS/400 background gets in my way: there, lines, SNA controllers (PU), and SNA devices (LU) are clearly distinct. On top of that come the modes, defining session parameters.