SDL-Hercules-390 / hyperion

The SDL Hercules 4.x Hyperion version of the System/370, ESA/390, and z/Architecture Emulator

Request for Native SNA support #348

Closed vinatron closed 2 years ago

vinatron commented 3 years ago

Hello,

I have been a community member for a few years and now have some hardware of my own. This isn't really an issue; however, I would like to have SNA support in Hercules so that I can test SNA topologies without having to keep my frame online.

I am interested in assisting with adding support in any way I can because the original system had an OSA adapter defined in OSC mode for SNA console access. If I can help contribute any data to help with the addition of SNA into Hercules, please let me know.

I don't, however, have a communications controller. If I did, I would provide data on those too, so that the 370 could have native SNA as well. My end goal is to have a Cisco and IBM SNA network going between a client node and my AS/400 and mainframe systems, and to use the Communication Tools MVS/VM bridge on the AS/400 to set up NJE connections over SNA, because that's how the manual explains the connection process.

Thanks for the continued development on the software. It is greatly appreciated. Also, if you need any additional data: my machine isn't the newest, but I can provide whatever you need. It's a z114-M05 with 2 CPs and 8 GB of storage ("RAM"). I also have plans to try to connect my ESCON tape device and see if I can transfer data to real tapes.

Kind regards, Vincent

mcisho commented 3 years ago

LCS SNA no longer requires an oat. Comment the LCS statement using the oat, and add the following LCS statement to the config:

0E40  LCS  -e SNA -m macaddress -d

The oat you posted yesterday should have set the MAC address of the tap, that is what the HWADD line is for. Were there any indications that setting the MAC failed? Hercules should have set the MAC address. If the config line above doesn't work we'll try a preconfigured interface.

I'd love to have a trace of the SNA ethernet frames arriving from the AS/400 that I could peruse with Wireshark at my leisure. Could you re-run your test with tcpdump writing an output file, and attach the zipped file here?

tcpdump -ni tap0 -s0 -w ./sna.pcap

Thanks.

vinatron commented 3 years ago

You need someone to form an SNA connection with an AS/400 and Hercules? I could probably do it soon with zOS on a QDIO adapter and try the Hercules driver because I still can't get my non-QDIO adapter working.

PoC-dev commented 3 years ago

LCS SNA no longer requires an oat. Comment the LCS statement using the oat, and add the following LCS statement to the config:

0E40  LCS  -e SNA -m macaddress -d

This does not yield the MAC-Address I've configured. My entry: 0E40 LCS -e SNA -m 02:00:FE:DF:00:42 -d

Output of ifconfig tap0:

tap0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 72:d4:61:dc:f3:a3  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

I did a git pull and a recompile/reinstall, since I saw changes. Still, every launch of Hercules yields a random MAC. Output from the log:

15:58:51 HHC00901I 0:0E40 LCS: Interface tap0, type TAP opened
15:58:51 HHC00921I CTC: lcs device port 00: manual Multicast assist enabled
15:58:51 HHC00935I CTC: lcs device port 00: manual Checksum Offload enabled
15:58:51 HHC00967D CTC: lcs device port 00: read thread: waiting for start event
15:58:51 HHC03984D LCS_AttnThread activated

After you wrote which repo to clone, I ran: git clone https://github.com/mcisho/hyperion-lcs-sna.git. Next, the usual thing:

./configure --enable-ipv6  --enable-cckd-bzip2 --enable-het-bzip2 --enable-optimization=yes
make -j2
make install

(It's a dual-core machine.) Apparently, I'm using the correct binary. From the log:

15:58:51 HHC01415I Build date: Apr  8 2021 at 15:51:01
15:58:51 HHC01417I Built with: GCC 8.3.0
15:58:51 HHC01417I Build type: GNU/Linux x86_64 host architecture build
15:58:51 HHC01417I Modes: S/370 ESA/390 z/Arch

Still, every Hercules start yields a new, seemingly random MAC.

The oat you posted yesterday should have set the MAC address of the tap, that is what the HWADD line is for.

Obviously, that did not work.   ;-)

Were there any indications that setting the MAC failed?

No.

Hercules should have set the MAC address.

That's also my understanding, but apparently it did not, even though it's running as root.

If the config line above doesn't work we'll try a preconfigured interface.

I tried that months before. Maybe I understood configuration statements wrong, but Hercules was ignoring the existing interface and always created a new one. Of course I can't recall what I did, but I remember that there was no way to actually name the interface to use; but only where to find the generic /dev/net/tun. If you can enlighten me on this, I'd be glad!

I'd love to have a trace of the SNA ethernet frames arriving from the AS/400 that I could peruse with Wireshark at my leisure. Could you re-run your test with tcpdump writing an output file, and attach the zipped file here? […] Thanks!

You're most welcome! See attachment. I hope this helps!

PoC-dev commented 3 years ago

You need someone to form an SNA connection with an AS/400 and Hercules? I could probably do it soon with zOS on a QDIO adapter and try the Hercules driver because I still can't get my non-QDIO adapter working.

Which kind of "SNA connection" are we talking about? 802.2 DLC, or Enterprise Extender, SNA over TCP/UDP?

As far as I've understood, QDIO only supports IP. That's why I asked you if you could maybe find out how to use non-QDIO-Mode to be able to talk SNA over 802.2 DLC. I guess and hope that the CCWs for SNA for non-QDIO and LCS in SNA mode are the same. Even IBM rarely reinvents the wheel.

Meanwhile, Ian decided to move on, trying to guess what it takes to make VTAM happy. I took another turn at seeing what happens when Hercules actually gets frames to digest. Right now we're still investigating some glitches preventing Hercules from recognizing these frames. Once the LCS code actually has to handle them, we'll see what happens.

Still, a CCW trace of a (real!) non-QDIO adapter being successfully activated might greatly help to understand what VTAM expects to see.

mcisho commented 3 years ago

OK, discovered why the MAC address wasn't being set, it is done in the processing of an LCS command that isn't used by SNA. The latest commit corrects that problem, and also corrects the discarding of the frames arriving from the AS/400. However, as the XCA is still not activating successfully, nothing will happen with the arriving frames!

I should have mentioned that using an LCS config command with the -e SNA option means you can only use adapter number zero with that LCS. If you want to use a non-zero adapter number you have to define the LCS using an oat.

Do you see the data from the guest that has me baffled (see comment from nine days ago)? It should be traced as 'executing command other (0x00)'.

PoC-dev commented 3 years ago

OK, discovered why the MAC address wasn't being set, it is done in the processing of an LCS command that isn't used by SNA. The latest commit corrects that problem, and also corrects the discarding of the frames arriving from the AS/400.

Hm. Does not work for me. Did a git pull & co. as described above.

01:31:14 HHC01415I Build date: Apr  8 2021 at 21:25:14

MAC of tap0 still appears random.

However, as the XCA is still not activating successfully, nothing will happen with the arriving frames!

Your idea (and solutions) to guess how to make VTAM happy was brilliant! But, again we're stuck, waiting for a chance to get help from someone with real hardware, and expertise.

I know that the P390 environment also relies on an emulated LCS. And I know someone who owns a machine with a P390. But I don't know if it's complete, or working at all. I'll ask the owner. Also, maybe it would be worth trying to privately contact people who have stated in the past that they run a P390 themselves? This might solve the "where to get the data from" problem. It will not solve my lack of VTAM wizardry.

I should have mentioned that using an LCS config command with the -e SNA option means you can only use adapter number zero with that LCS. If you want to use a non-zero adapter number you have to define the LCS using an oat.

Since I'm indeed using adapter 0, that should work.

Do you see the data from the guest that has me baffled (see comment from nine days ago)? It should be traced as 'executing command other (0x00)'.

I did see it, but I did not understand. Way too low-level for me. ;-)

mcisho commented 3 years ago

The MAC is set, however it is set when the XCA is activated, not when the tap is created, which is no use in this case. Preconfigured interface, that's what we need.

The following is a shell script I use to create the preconfigured interface I'm using to test these SNA changes:-

#! /bin/sh
#
sudo firewall-cmd --quiet --zone=trusted --remove-interface=tap223
#
sudo ip tuntap del mode tap dev tap223
#
sudo ip tuntap add mode tap dev tap223
sudo nmcli dev set tap223 managed no
sudo ip link set dev tap223 address 76:61:70:02:23:80
sudo ip link set dev tap223 mtu 1500
sudo ip link set dev tap223 up
#
sudo firewall-cmd --quiet --zone=trusted --change-interface=tap223
#

My host is Fedora. The firewall-cmd commands are there to manage Fedora's organization of the firewall, for Debian they are probably unnecessary. The nmcli command is there to stop Network Manager getting involved, again for Debian it is probably unnecessary. Having changed the interface name and MAC address to your requirements, run the shell command, then bridge the resulting tap interface.

To use the preconfigured tap I use the following Hercules config statement:-

0E43  LCS  -e SNA tap223

Having changed the device address and interface name to your requirements, bring up Hercules.
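Before bringing up Hercules it can be worth confirming that the tap really carries the MAC you assigned. The following is a minimal sketch, assuming a Linux host that exposes interface addresses under /sys/class/net; the function names `normalize_mac` and `tap_has_mac` are made up for illustration and are not part of Hercules.

```shell
#!/bin/sh
# Lower-case a MAC address so that comparisons ignore case.
normalize_mac() {
    printf '%s\n' "$1" | tr 'A-F' 'a-f'
}

# Succeeds if interface $1 currently has MAC address $2.
# $3 optionally overrides the sysfs root
# (assumption: the usual Linux /sys/class/net layout).
# Returns 2 if the interface's address file cannot be read.
tap_has_mac() {
    sysfs_root=${3:-/sys/class/net}
    addr_file="$sysfs_root/$1/address"
    [ -r "$addr_file" ] || return 2
    actual=$(normalize_mac "$(cat "$addr_file")")
    [ "$actual" = "$(normalize_mac "$2")" ]
}

# Usage, with the interface name and MAC from the script above:
# tap_has_mac tap223 76:61:70:02:23:80 && echo "tap223 MAC ok"
```

Running the check again after Hercules has started shows whether the address survived device initialization.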

When I said "Do you see the data from the guest that has me baffled ..." I meant do you see it appearing on your Hercules console, rather than asking whether you understand it? I wanted to check that your system gets as far as mine, and also to ask whether the baffling data is the same or different on your system?

Ian

mcisho commented 3 years ago

Btw, having read more about Ethernet frames, the frames arriving from the AS/400 seem to make sense. The following is one of the examples you posted a couple of days ago:-

16:39:36 HHC00984D 0:0E40 LCS: port 00: Receive frame of size 60 bytes (with unknown packet) from device tap0
16:39:36 HHC00979D LCS: eth frame: +0000> 0200FEDF 00420020 35B54164 00030404  .....B. 5.Ad.... ................
16:39:36 HHC00979D LCS: eth frame: +0010> BF000021 00000000 00000000 00000041  ...!...........A ................
16:39:36 HHC00979D LCS: eth frame: +0020> 00000000 00000000 00000000 00CC52D0  ..............R. ...............}
16:39:36 HHC00979D LCS: eth frame: +0030> 0000002C E0CC5001 00CC5314           ...,..P...S.     ....\.&.....    
16:39:36 HHC00951D CTC: lcs device port 00: no match found, discarding frame

The length 0x0003 is correct, the frame only contains 3-bytes of payload (the 802.2 LLC header, i.e. the 0x0404BF). However, the minimum payload an 802.3 frame can contain is 46-bytes, so 43-bytes of whatever happened to be in the output buffer was sent to make up the minimum length frame. Just as well your credit card number didn't happen to be in those 43-bytes!
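The arithmetic above can be checked mechanically. This is a small sketch, assuming the frame is pasted as the hex words from the HHC00979D lines; `decode_8023` is a hypothetical helper name, and it only handles 802.3 frames with a length field (not Ethernet II frames with an EtherType).

```shell
#!/bin/sh
# Print the 802.3 header fields, the 3-byte 802.2 LLC header and the
# number of padding bytes for a frame given as a hex string.
# Spaces and newlines in the input are ignored.
decode_8023() {
    hex=$(printf '%s' "$1" | tr -d ' \n')
    dst=$(printf '%s' "$hex" | cut -c1-12)    # destination MAC
    src=$(printf '%s' "$hex" | cut -c13-24)   # source MAC
    len=$(printf '%s' "$hex" | cut -c25-28)   # 802.3 length field
    llc=$(printf '%s' "$hex" | cut -c29-34)   # DSAP, SSAP, control
    frame_bytes=$(( ${#hex} / 2 ))
    payload=$(( 0x$len ))
    pad=$(( frame_bytes - 14 - payload ))     # 14 = MAC header size
    printf 'dst=%s src=%s len=%d llc=%s pad=%d\n' \
        "$dst" "$src" "$payload" "$llc" "$pad"
}

# The 60-byte frame from the trace above.
frame='0200FEDF 00420020 35B54164 00030404
BF000021 00000000 00000000 00000041
00000000 00000000 00000000 00CC52D0
0000002C E0CC5001 00CC5314'

decode_8023 "$frame"
# prints: dst=0200FEDF0042 src=002035B54164 len=3 llc=0404BF pad=43
```

The 43 padding bytes are simply the 60-byte frame minus the 14-byte MAC header and the 3-byte payload.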

PoC-dev commented 3 years ago

The MAC is set, however it is set when the XCA is activated, not when the tap is created, which is no use in this case. Preconfigured interface, that's what we need.

I understand.

The following is a shell script I use to create the preconfigured interface […]

Thanks, incorporated, works: After bringing up Hercules (without IPL, though), tcpdump outputs (broadcast) traffic being passed from the main switch to all interfaces — which is to be expected.

When I said "Do you see the data from the guest that has me baffled ..." I meant do you see it appearing on your Hercules console, rather than asking whether you understand it?

Sorry for the misunderstanding!

I wanted to check that your system gets as far as mine, and also to ask whether the baffling data is the same or different on your system?

Okay, now I get it. See below.

The length 0x0003 is correct, the frame only contains 3-bytes of payload (the 802.2 LLC header, i.e. the 0x0404BF). However, the minimum payload an 802.3 frame can contain is 46-bytes, so 43-bytes of whatever happened to be in the output buffer was sent to make up the minimum length frame. Just as well your credit card number didn't happen to be in those 43-bytes!

Thanks for the explanation! I'm not sure if the data is random. As far as I understand, it's part of the XID exchange process. See here for the OS/400 description, and here for the same in z/OS. I did not yet look up the precise payload data format, though. I suppose it's documented in the IBM SNA Formats manual, but most likely any reader needs to assemble the final frame by cross-reading many sections.

Reply to an earlier message

At present I'm just trying to understand what the messages coming from VTAM mean, and how to fool VTAM into being happy with the responses. The latest message I'm trying to understand, and which currently has me completely baffled, is:-

HHC00979D LCS: data: +0000< 00160000 00000000 00140400 000C0C99  ................ ...............r
HHC00979D LCS: data: +0010< 0003C000 00000000 01000000 0000      ..............   ..{...........  

If I understand correctly, this is what's appearing in the Hercules log when you activate the LCS? This is the complete output I get by activating:

14:36:16 HHC03992D 0:0E40 LCS: Code 03: Flags 30: Count 00000001: Chained 00: PrevCode 00: CCWseq 0
14:36:16 HHC03993D 0:0E40 LCS: Status 0C: Residual 00000001: More 00
14:36:16 HHC03992D 0:0E40 LCS: Code C3: Flags 20: Count 00000001: Chained 00: PrevCode 00: CCWseq 0
14:36:16 HHC03993D 0:0E40 LCS: Status 0C: Residual 00000000: More 00
14:36:16 HHC03992D 0:0E40 LCS: Code E4: Flags 20: Count 000000FF: Chained 00: PrevCode 00: CCWseq 0
14:36:16 HHC03993D 0:0E40 LCS: Status 0C: Residual 000000F8: More 00
14:36:16 HHC03992D 0:0E40 LCS: Code 01: Flags 20: Count 00000018: Chained 00: PrevCode 00: CCWseq 0
14:36:16 HHC00981D 0:0E40 LCS: Accept data of size 24 bytes from guest
14:36:16 HHC00979D LCS: data: +0000< 00160000 41000000 00000000 00000000  ....A........... ................
14:36:16 HHC00979D LCS: data: +0010< 00000000 00000000                    ........         ........        
14:36:16 HHC00922D 0:0E40 CTC: lcs command packet received
14:36:16 HHC00979D LCS: command: +0000< 00160000 41000000 00000000 00000000  ....A........... ................
14:36:16 HHC00979D LCS: command: +0010< 00000000 0000                        ......           ......          
14:36:16 HHC00933D 0:0E40 CTC: executing command start lan sna
14:36:16 HHC00923D 0:0E40 CTC: lcs command reply enqueue
14:36:16 HHC00979D LCS: reply: +0000> 00000000 41800000 00000100 50000000  ....A.......P... ............&...
14:36:16 HHC00979D LCS: reply: +0010> 00000000 00000800                    ........         ........        
14:36:16 HHC00966I 0:0E40 CTC: lcs triggering port 00 event
14:36:16 HHC00968D CTC: lcs device port 00: read thread: port started
14:36:16 HHC00984D 0:0E40 LCS: port 00: Receive frame of size 90 bytes (with IPv6 packet) from device tap0
14:36:16 HHC00979D LCS: eth frame: +0000> 33330000 00160200 FEDF0042 86DD6000  33.........B..`. ............f.-.
14:36:16 HHC00979D LCS: eth frame: +0010> 00000024 0001FE80 00000000 00000000  ...$............ ................
14:36:16 HHC00979D LCS: eth frame: +0020> FEFFFEDF 0042FF02 00000000 00000000  .....B.......... ................
14:36:16 HHC00979D LCS: eth frame: +0030> 00000000 00163A00 05020000 01008F00  ......:......... ................
14:36:16 HHC00979D LCS: eth frame: +0040> 71C60000 00010400 0000FF02 00000000  q............... .F..............
14:36:16 HHC00979D LCS: eth frame: +0050> 00000000 0001FFDF 0042               .........B       ..........      
14:36:16 HHC00951D CTC: lcs device port 00: no match found, discarding frame
14:36:16 HHC00984D 0:0E40 LCS: port 00: Receive frame of size 90 bytes (with IPv6 packet) from device tap0
14:36:16 HHC00979D LCS: eth frame: +0000> 33330000 00160200 FEDF0042 86DD6000  33.........B..`. ............f.-.
14:36:16 HHC00979D LCS: eth frame: +0010> 00000024 0001FE80 00000000 00000000  ...$............ ................
14:36:16 HHC00979D LCS: eth frame: +0020> FEFFFEDF 0042FF02 00000000 00000000  .....B.......... ................
14:36:16 HHC00979D LCS: eth frame: +0030> 00000000 00163A00 05020000 01008F00  ......:......... ................
14:36:16 HHC00979D LCS: eth frame: +0040> 71C60000 00010400 0000FF02 00000000  q............... .F..............
14:36:16 HHC00979D LCS: eth frame: +0050> 00000000 0001FFDF 0042               .........B       ..........      
14:36:16 HHC00951D CTC: lcs device port 00: no match found, discarding frame
14:36:16 HHC03993D 0:0E40 LCS: Status 0C: Residual 00000000: More 00
14:36:16 HHC03991D 0:0E40 LCS: device_attention rc=1  0  50
14:36:16 HHC03991D 0:0E40 LCS: device_attention rc=0  1  100
14:36:16 HHC03992D 0:0E40 LCS: Code 02: Flags 20: Count 000000FF: Chained 00: PrevCode 00: CCWseq 0
14:36:16 HHC00982D 0:0E40 LCS: Present data of size 26 bytes to guest
14:36:16 HHC00979D LCS: data: +0000> 00180000 41800000 00000100 50000000  ....A.......P... ............&...
14:36:16 HHC00979D LCS: data: +0010> 00000000 00000800 0000               ..........       ..........      
14:36:16 HHC03993D 0:0E40 LCS: Status 0C: Residual 000000E5: More 00
14:36:16 HHC03992D 0:0E40 LCS: Code 01: Flags 20: Count 00000016: Chained 00: PrevCode 00: CCWseq 0
14:36:16 HHC00981D 0:0E40 LCS: Accept data of size 22 bytes from guest
14:36:16 HHC00979D LCS: data: +0000< 00140000 44000001 00000000 00000000  ....D........... ................
14:36:16 HHC00979D LCS: data: +0010< 00000000 0000                        ......           ......          
14:36:16 HHC00922D 0:0E40 CTC: lcs command packet received
14:36:16 HHC00979D LCS: command: +0000< 00140000 44000001 00000000 00000000  ....D........... ................
14:36:16 HHC00979D LCS: command: +0010< 00000000                             ....             ....            
14:36:16 HHC00933D 0:0E40 CTC: executing command lan statistics sna
14:36:16 HHC00942I CTC: lcs device tap0 using mac 02:00:FE:DF:00:42
14:36:16 HHC00923D 0:0E40 CTC: lcs command reply enqueue
14:36:16 HHC00979D LCS: reply: +0000> 00000000 44800001 00000000 01040000  ....D........... ................
14:36:16 HHC00979D LCS: reply: +0010> 00000602 00FEDF00 4300               ........C.       ..........      
14:36:16 HHC03993D 0:0E40 LCS: Status 0C: Residual 00000000: More 00
14:36:16 HHC03991D 0:0E40 LCS: device_attention rc=1  0  50
14:36:16 HHC03991D 0:0E40 LCS: device_attention rc=1  1  100
14:36:16 HHC03991D 0:0E40 LCS: device_attention rc=0  2  200
14:36:16 HHC03992D 0:0E40 LCS: Code 02: Flags 20: Count 000000FF: Chained 00: PrevCode 00: CCWseq 0
14:36:16 HHC00982D 0:0E40 LCS: Present data of size 28 bytes to guest
14:36:16 HHC00979D LCS: data: +0000> 001A0000 44800001 00000000 01040000  ....D........... ................
14:36:16 HHC00979D LCS: data: +0010> 00000602 00FEDF00 43000000           ........C...     ............    
14:36:16 HHC03993D 0:0E40 LCS: Status 0C: Residual 000000E3: More 00
14:36:16 HHC03992D 0:0E40 LCS: Code 17: Flags 60: Count 00000001: Chained 00: PrevCode 00: CCWseq 0
14:36:16 HHC03993D 0:0E40 LCS: Status 0C: Residual 00000001: More 00
14:36:16 HHC03992D 0:0E40 LCS: Code 01: Flags 64: Count 0000001E: Chained 40: PrevCode 17: CCWseq 1
14:36:16 HHC00981D 0:0E40 LCS: Accept data of size 30 bytes from guest
14:36:16 HHC00979D LCS: data: +0000< 00160000 00000000 00140400 000C0C99  ................ ...............r
14:36:16 HHC00979D LCS: data: +0010< 0003C000 00000000 01000000 0000      ..............   ..{...........  
14:36:16 HHC00922D 0:0E40 CTC: lcs command packet received
14:36:16 HHC00979D LCS: command: +0000< 00160000 00000000 00140400 000C0C99  ................ ...............r
14:36:16 HHC00979D LCS: command: +0010< 0003C000 0000                        ......           ..{...          
14:36:16 HHC00933D 0:0E40 CTC: executing command other (0x00)
14:36:16 HHC00923D 0:0E40 CTC: lcs command reply enqueue
14:36:16 HHC00979D LCS: reply: +0000> 00000000 00800000 00000400 000C0C99  ................ ...............r
14:36:16 HHC00979D LCS: reply: +0010> 0003C000                             ....             ..{.            
14:36:16 HHC03993D 0:0E40 LCS: Status 0C: Residual 00000000: More 00
14:36:16 HHC03991D 0:0E40 LCS: device_attention rc=1  0  50
14:36:16 HHC03992D 0:0E40 LCS: Code 02: Flags 24: Count 00000800: Chained 40: PrevCode 01: CCWseq 2
14:36:16 HHC00982D 0:0E40 LCS: Present data of size 22 bytes to guest
14:36:16 HHC03991D 0:0E40 LCS: device_attention rc=1  1  100
14:36:16 HHC00979D LCS: data: +0000> 00140000 00800000 00000400 000C0C99  ................ ...............r
14:36:16 HHC03991D 0:0E40 LCS: device_attention rc=1  2  200
14:36:16 HHC00979D LCS: data: +0010> 0003C000 0000                        ......           ..{...          
14:36:16 HHC03993D 0:0E40 LCS: Status 0C: Residual 000007EA: More 00
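
One way to sanity-check a trace like this: in the log above, the Residual reported in each HHC03993D line for a read CCW is consistent with the CCW Count minus the number of bytes presented to the guest. This is a reading of the log output only, not something taken from the Hercules source, and `ccw_residual` is a hypothetical helper name.

```shell
#!/bin/sh
# Compute the expected residual count for a CCW.
# $1 = CCW Count as printed in the HHC03992D line (hex, no 0x prefix)
# $2 = number of bytes actually transferred (decimal)
ccw_residual() {
    printf '%08X\n' $(( 0x$1 - $2 ))
}

ccw_residual 000000FF 26   # prints: 000000E5
ccw_residual 000000FF 28   # prints: 000000E3
ccw_residual 00000800 22   # prints: 000007EA
```

All three values match the Residual fields that follow the "Present data of size ..." messages in the trace.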

What I do not understand: I can't see the packets from my AS/400 on the tap side, only on eth0! This has happened before, after my earlier MAC change with ifconfig, but I suspected a problem with the older ifconfig code not coping with the tap APIs. I also saw this problem after following a recommendation to use the newer programs.

Old syntax, worked (but I did not yet adapt to a preconfigured tap):

/sbin/brctl addbr br0
/sbin/ifconfig eth0 up
/sbin/brctl addif br0 eth0
/sbin/bridge link set dev eth0 state 3
/sbin/ifconfig tap0 up
/sbin/brctl addif br0 tap0
/sbin/bridge link set dev tap0 state 3

New syntax; this also did not work before the introduction of a permanent interface. Here is what I currently use:

/sbin/ip link add br0 type bridge
/sbin/ip link set dev eth0 up
/sbin/ip link set dev eth0 master br0
/sbin/bridge link set dev eth0 state 3
/sbin/ip tuntap add mode tap dev tap0
/sbin/ip link set dev tap0 address 02:00:FE:DF:00:42
/sbin/ip link set dev tap0 mtu 1500
/sbin/ip link set dev tap0 up
/sbin/ip link set dev tap0 master br0
/sbin/bridge link set dev tap0 state 3

Need to investigate further.

Further help

Today I was talking to Marco Lorig. He owns a P390 with all the associated things. To summarize, he'll install the P390 in an RS/6000. The accompanying software is available for download from ftp://p390.ibm.com (the complete archive is around 96 MB). We did not set a deadline, but since he's eager to help with getting SNA to run, I expect he'll have the whole thing up and basically running within the next two weeks.

He's a beginner in MVS and certainly needs precise instructions on what to do, though. As a possible solution, he also offered 24x7 run-time for a while, and remote access via VPN, so we can conduct first-hand tests, channel traces and probably even packet captures. I don't know of any SNA peer systems he has to test with, though. But this might help to see how activation of the LCS adapter behaves on "real hardware". I'll keep you updated.

Hopefully Marco himself will join the discussion.

mcisho commented 3 years ago

Good, glad the preconfigured interface works. I forgot to mention, the interface name is free format, e.g. mytestif, it just needs to be unique. I believe the name is limited to eight characters, though I've never tried a name that long.

No, the 0x0404BF isn't random, it's an 802.2 LLC header. The 0x04's say the destination and source are SNA Path Control, the 0xBF are control bits, though I've yet to find an explanation of what they mean.

Interesting, the baffling message on your system is exactly the same as on two of mine. All we need now is a clue to enlightenment!

Good news about the P390. Hope it allows us to make some progress.

PoC-dev commented 3 years ago

No, the 0x0404BF isn't random, it's an 802.2 LLC header. The 0x04's say the destination and source are SNA Path Control, the 0xBF are control bits, though I've yet to find an explanation of what they mean.

See the link from Fish above, to the SNA Formats documentation. It's very extensive and incredibly complex. Maybe you can find clues in Chapter 3, XID Information Fields, starting on PDF Page 61. It documents different formats, being dependent on transport (SDLC, S/390 Channel), and formats designated 0, 1, and 2. I'm still overwhelmed by that.

Interesting, the baffling message on your system is exactly the same as on two of mine. All we need now is a clue to enlightenment!

Could the SNA formats documentation help, maybe?

Good news about the P390. Hope it allows us to make some progress.

Yep. I'll keep you updated.

mcisho commented 3 years ago

The SNA Formats only describes the LLC in a couple of contexts, and I couldn't find a description in the context we see. However, I discovered a Cisco document describing LLC, so all is now clear. The 0xBF is a request to start an XID exchange, i.e. part of the normal SNA protocol, so I don't, hopefully, need to know more. Investigating the frames was useful, however: it highlighted a couple of things that needed to be changed in LCS.
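
For anyone wanting to decode further control bytes: in 802.2 LLC an unnumbered (U-format) control byte has its two low-order bits set, and bit 0x10 is the poll/final bit. Masking that bit out of 0xBF leaves 0xAF, the XID code. Below is a rough sketch covering only the XID, TEST and UI codes; `llc_u_frame` is a hypothetical name, and other U-format codes are simply reported as "other".

```shell
#!/bin/sh
# Classify an 802.2 LLC unnumbered (U-format) control byte.
# $1 = control byte in hex without 0x prefix, e.g. BF
llc_u_frame() {
    ctl=$(( 0x$1 ))
    # U-format frames have both low-order bits set.
    if [ $(( ctl & 0x03 )) -ne 3 ]; then
        echo "not a U-format frame"
        return 1
    fi
    pf=$(( (ctl & 0x10) >> 4 ))        # poll/final bit
    case $(( ctl & 0xEF )) in          # control code without P/F
        175) name=XID ;;               # 0xAF
        227) name=TEST ;;              # 0xE3
        3)   name=UI ;;                # 0x03
        *)   name=other ;;
    esac
    echo "$name pf=$pf"
}

llc_u_frame BF   # prints: XID pf=1
```

So the 0x0404BF seen in the frames is an XID with the poll bit set, sent between SAP 0x04 on both sides.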

The SNA Formats is of no help with the messages flowing between VTAM and the LCS itself, IBM obviously still considers them proprietary and of no concern to anyone else.

Fish-Git commented 3 years ago

Patrik Schindler (PoC-dev) wrote:

Apparently, I'm using the correct binary. From the log:

15:58:51 HHC01415I Build date: Apr  8 2021 at 15:51:01
15:58:51 HHC01417I Built with: GCC 8.3.0
15:58:51 HHC01417I Build type: GNU/Linux x86_64 host architecture build
15:58:51 HHC01417I Modes: S/370 ESA/390 z/Arch

FYI: The build date does not prove that you are running the correct binary. It only proves that whatever you built, was built on the reported date.

The only way to prove that you built the correct binary is to examine the reported version string:

20:36:41.207 HHC01413I Hercules version 4.3.0.10296-SDL-ga4db8213 (4.3.0.10296)

It is the "ghhhhhhhh" portion of the HHC01413I message's version string that reports the exact version that is being executed. The "g" in the first position indicates the repository is a "git" repository. The 8 characters following it (a4db8213) are the git repository hash of the last commit that was made to the repository (which should be universally unique across all git repositories).

The number "10296" in the "4.3.0.10296" part of the version string is simply a commit count (i.e. the total number of commits that have been made to that repository) and is not really that important. It only exists so that it can be used to construct the 4-tuple numeric version number that is placed into the resource fork of Windows executables (along with other version information too).

If you look at the list of commits (i.e. 'git log') for Ian's repository, you will see that, at the time of my writing this, the most recent commit was "0156c0778600781b0c321328435be6286b30078c". Thus, if you take the first 8 digits of that git repository commit hash value (0156c077), then the version string that message HHC01413I should report (if you indeed built the correct repository and are indeed running the resulting binary from that repository build) should be something like:

HHC01413I Hercules version 4.4.9999.nnnnn-SDL-g0156c077 (4.4.9999.nnnnn)

(The "9999" in the version string simply means this is a not-yet-released slated-to-be 4.4 still-under-development version of Hercules. That is to say, once it's finished and officially released, it will be version "4.4", but because it's still under development, we don't know what the final official version will be yet, so we use "9999" to indicate it's a development build and not an official release build.)

But the most critically important part of the version string is the git hash portion of the version string. The "ghhhhhhhh" part. That tells us exactly what version you are running.

But the build date is largely meaningless. It doesn't tell us much of anything other than the date that, whatever it was that you built, was built.
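
The comparison can be scripted. This is only a sketch: `hercules_git_hash` is a hypothetical helper, and the log file name and checkout path in the usage comment are placeholders to adapt to your own setup.

```shell
#!/bin/sh
# Extract the 8-character git hash from an HHC01413I log line.
# Prints nothing if the line does not contain a "-g<hash>" suffix.
hercules_git_hash() {
    printf '%s\n' "$1" |
        sed -n 's/.*HHC01413I .*-g\([0-9a-f]\{8\}\).*/\1/p'
}

hercules_git_hash '20:36:41.207 HHC01413I Hercules version 4.3.0.10296-SDL-ga4db8213 (4.3.0.10296)'
# prints: a4db8213

# Usage (placeholders: hercules.log, ~/hyperion-lcs-sna):
# built=$(hercules_git_hash "$(grep HHC01413I hercules.log | tail -n 1)")
# head=$(git -C ~/hyperion-lcs-sna rev-parse --short=8 HEAD)
# [ "$built" = "$head" ] && echo "running binary matches checkout"
```

Note that `git rev-parse --short=8` may print more than 8 characters if the abbreviation would otherwise be ambiguous, so treat a mismatch as a prompt to look closer rather than as proof.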

Just trying to help.

PoC-dev commented 3 years ago

Thanks, Fish. Your points are all valid, but for me it was the easiest proof that I'm using the binary I last compiled myself. And that was what's important. How easily one could forget to make install…

mcisho commented 3 years ago

A little more progress, the XCA will become active and remain active for 30 seconds, whereupon it stops.

16:30:35.79  V NET,ACT,ID=XCAE43E
16:30:35.80  IST097I VARY ACCEPTED
16:30:35.81  IST093I XCAE43E ACTIVE
16:31:06.18  IST380I ERROR FOR ID = XGEL00 - REQUEST: ACTLINK, SENSE: 081C0000

Presumably VTAM is expecting something from the network or the LCS, or perhaps something else needs to be activated in VTAM?
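
As a reading aid for the IST380I message: SNA sense data is four bytes, made up of a one-byte category, a one-byte modifier and two bytes of sense-code-specific information. As far as I recall from the SNA Formats manual, category 08 is "request reject" and 081C is "request not executable", but please check that against the manual rather than taking it from here. The helper name `split_sense` is made up.

```shell
#!/bin/sh
# Split an 8-hex-digit SNA sense code into its three fields.
split_sense() {
    category=$(printf '%s' "$1" | cut -c1-2)
    modifier=$(printf '%s' "$1" | cut -c3-4)
    specific=$(printf '%s' "$1" | cut -c5-8)
    echo "category=$category modifier=$modifier specific=$specific"
}

split_sense 081C0000
# prints: category=08 modifier=1C specific=0000
```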

PoC-dev commented 3 years ago

Do you think it's feasible for me to try again and send packets to see if something will happen with them?

mcisho commented 3 years ago

Though I said the XCA remains active for 30 seconds, I don't believe it's actually active, but I could be wrong! Send packets, it can't do any harm, and might even produce some useful results.

vinatron commented 3 years ago

Let me know if you need any tests done. I'm setting up a zOS 1.11 minimal install to use as an SNA router so I can do some tests with that.

mcisho commented 3 years ago

Please, test anything, the more testing the better.

vinatron commented 3 years ago

Will do. I'm in the process of setting up the DDR to transfer.

PoC-dev commented 3 years ago

A friend of mine owns a P390 and is currently setting it up as an R390 in one of his RS/6000s. He's pretty fluent with AIX, but has no experience with MVS and its successors. He has all he needs except time. Thus, progress is slow, because he's rather busy.

The goal is to have an OS/390 instance accessible remotely for obtaining channel traces and the like while activating and deactivating VTAM resources, foremost the (also emulated) LCS line. I really hope this will shed some light on this topic.

vinatron commented 3 years ago

Great news! I am still working out the kinks in my zOS instance.

vinatron commented 3 years ago

I may have figured out how to get OSA/SF working; stay tuned. I'm also still working on getting the DDR of zOS restored to DASD. I've been having minor DASD issues, but overall it still works.

PoC-dev commented 3 years ago

Good news, everyone! Today (GMT +0200), my friend Marco and I managed to get a P390 configured and running as an R390 on an old RS/6000 with OS/390 ADCD. I activated a VTAM definition for the LCS, stripped down to contain just the PORT definition. VTAM activates it and it stays active for more than 30 seconds without error.

We now have a platform with optional remote access to console, and TSO for obtaining channel traces to continue with implementing SNA support into LCS.

In a few days, Marco will have a Cisco router on site with SNAsw support if traces need to be made for actual packet sending/receiving.

Personally, I don't yet know how to obtain such a channel trace from within OS/390. If anybody skilled wants to volunteer, I'd be glad to connect you with Marco. Otherwise, I need to research and learn how to obtain channel traces myself.

Ian, what's your opinion?

vinatron commented 3 years ago

I know how to grab packet captures off the LAN if he has a Cisco switch. I've also been reading up on the configuration of SNASw, and I'm potentially going to have some traces of SNA traffic. However, they will be on a QDIO interface, because I still can't get my OSE device to function. Let me know if you need any assistance in that respect. I'm not sure how you do dump traces with an R390, but I know how to get dump traces on a zSeries machine.

PoC-dev commented 3 years ago

For actual packet captures, we have other means, like a hub and a Linux machine with tcpdump. Also, SNAsw configuration on routers isn't complicated at all. What we need isn't packet dumps of the network side. We need dumps of the communication between the OS and the (more or less emulated) hardware: dumps of the channel command words (CCWs). This is something between VTAM and the OS.

I guess that VTAM talking to an OSA in non-QDIO mode is mostly the same CCWs as VTAM talking to an LCS: IBM rarely reinvents the wheel. That's why I'm sure that the "control language" for the non-QDIO OSA, and LCS do not differ.

However, with a running OS/390 instance and an (emulated) LCS in SNA mode available, we just need to find out how to obtain dumps of the communications traffic between VTAM and the hardware, and to provide these dumps here for certain functions such as activating and deactivating the LCS, among others.

vinatron commented 3 years ago

I'll try and attach the OSE interface to the zOS machine I have and see if I can get it to come online and see if I can do a dump trace on the channel.

vinatron commented 3 years ago

I was going to set up OSA/SF, but now I'm trying to get my DS6K issues resolved. Kona 0 is having some minor issues.

vinatron commented 3 years ago

It came back up after some troubleshooting. KONA 0 is still not 100% healthy, but I can continue with testing. Also, if I IPL, I should be able to get some dump traces using the service user on the HMC and SE.

vinatron commented 3 years ago

It's going to take longer than I thought. The DS is in poor shape; it's giving me channel errors when I try to IPL.

mcisho commented 3 years ago

To trace the VTAM-LCS traffic you use GTF, the Generalized Trace Facility. The full details of GTF can be found in the manual MVS Diagnosis: Tools and Service Aids. GA22-7589-19 is the number of the z/OS 1.13 version of the manual; I don't know the number for the OS/390 2.10 version. However, GTF hasn't changed that much over the years, so any version should do.

I hadn't used GTF for many, many years, and found the attached SHARE presentation a good (re)introduction to running GTF.

GTF.zip

The GTFPARM I use for doing the trace is:-

TRACE=IO,CCW
IO=(E43)
CCW=(DATA=256)

E43 is the device address of my SNA LCS.

Fish-Git commented 3 years ago

..., and found the attached SHARE presentation ...

Oops?   :)

(Hint: whatever it was that you meant to attach didn't make it; your post contains no attachment!)

PoC-dev commented 3 years ago

Ian, I didn't take my time to dig into the documentation yet. If you're really curious and want to speed up the process, I'd appreciate a terse but complete step-by-step procedure to follow for a noob like me. ;-) Please also advise how to get the data from within the OS to the outside world. Remember, I'll not run this within a Hercules environment. Thank you!

vinatron commented 3 years ago

I would assume FTP or IEBCOPY to an AWS emulated tape. I'll look into it and see if I can write a guide when I get time.

mcisho commented 3 years ago

How noob is noob? And noob to what? You saw a mainframe once upon a time? You've never heard of TSO? How low should I start?

When you first mentioned the P390 you also mentioned remote access. Is that still a possibility? And what facilities are available on the P390? TCP/IP? FTP server?

I assumed a binary FTP from the P390, then paste it here or email it direct, and I would do the reverse at my end to look at the trace(s).

vinatron commented 3 years ago

Yes the P390 supports TCPIP on OS/390 and should work for copying the binary out.

PoC-dev commented 3 years ago

Not too noob, but I get your point. Sorry for being inadequately ambiguous.

Yes, remote-access is still a possibility. The P390 host is a standard AIX, so everything you need can be installed with minor effort.

I'll contact Marco, and forward your request. Stay tuned!

mcisho commented 3 years ago

I have tried Marco's P390, very speedy it is too, even remotely. However, I'm not clear what or where the LCS hardware is. When the XCA is activated what is VTAM communicating with? I asked Marco the same question and he replied "If i understand correctly, the P390 emulates a LAN3172 device at address 0E40 (Device type 3088) which uses the standard ethernet card of AIX.". Is this correct? And do we know whether it works or not, i.e. does it do SNA communication? I tried GTF tracing when I activated the XCA, but nothing was traced.

PoC-dev commented 3 years ago

The 0E40 device is mapped to an ethernet NIC connected to a hub or switch. It is no longer available to AIX, because of that mapping. Marco says, the card runs OK in AIX, so I guess anything else is "just" a VTAM configuration problem.

VTAM currently communicates with nothing, since Marco has no peer yet. Soon, a Cisco router with SNAsw will be available. I was assuming that what we already have might help you to at least look at the start and stop LAN commands for SNA. I'm surprised that the activation itself could not be traced.

We'll report when the Cisco Router is available. Sorry for the inconvenience!

mcisho commented 3 years ago

Huh, the lack of traced traffic was my fault, I hadn't found the necessary magic incantation, i.e. the correct GTF parameters. Now that I have, I have a trace of the XCA being activated and inactivated. Lots to study and understand! And once the Cisco is available, I expect there will be even more! Many thanks to Marco.

PoC-dev commented 3 years ago

I'm extraordinarily happy that tracing worked out eventually! I'll work with Marco to get the Router connected via SNA DLC and let you know.

Thank you for your patience and your incredible help to bring this forward!

mcisho commented 3 years ago

Good news, the LCS XCA can now be activated and inactivated. On my 2.10 under VM under Hercules the XCA can be activated/inactivated as often as one pleases, but on 2.10 under Hercules the XCA can only be activated/inactivated once. It appears the first Attention interrupt after the second activate is being ignored/lost/misinterpreted. At the moment I have no idea what's happening or why, and I don't feel inclined to spend any time investigating at present.

I haven't yet discovered how to make VTAM attempt to connect to something 'out there', and I have nothing that might attempt to connect to VTAM, so haven't seen any SNA packets flowing. I fully expect LCS will collapse in a heap when they do.

Though I have got the XCA to activate/inactivate, I have no idea what the contents of the messages coming from VTAM to the XCA mean, and have even less idea what the contents of the messages the XCA is sending to VTAM mean! Hopefully, one day, someone will be able to shed some light.

vinatron commented 3 years ago

Great news, thank you for your assistance with this.

PoC-dev commented 3 years ago

Thanks a lot for your incredible effort!

About the interrupt problem: This might be related to the MIH change that the documentation asks for. See chapter 13.2.1, Missing Interrupt Handler. Perhaps you might try that?

The question of how to make VTAM do what I want is also a serious obstacle for me. Fortunately, Marco has managed to get the necessary RAM and flash into the Cisco router, so I have a testbed with "should work" results.

I guess that that light can only be partially made visible by reverse engineering the P/R-390 support software. With this in mind, it's mind-boggling what the Samba team achieved, when MS was not as helpful as today. Very hard.

Again thank you! I'll get back to you when I can offer a functioning configuration.

PoC-dev commented 3 years ago

Hello,

If any reader is interested in participating in platform-independent discussions revolving mainly around SNA, maybe this will be beneficial to you:

:wq! PoC

mcisho commented 3 years ago

If anyone would like to try Hercules' LCS SNA support, the code can be found at:

The code is a branch of SDL-Hyperion-390 taken in March 2021. The code works in the limited environment it was developed in, so may not work in other environments.

Ian

PoC-dev commented 3 years ago

Expanding on that, we could successfully test:

So far we've only tested APPN connections. It would be great if people can test Subarea SNA connectivity.

wably commented 3 years ago

A suggestion if I may....

Please post some example VTAM parameters of the major node(s) that you used for success, as well as any relevant parameters from ATCSTRxx. There are too many possible parameter choices available to try to guess what you all are using that yielded success. Please include parameters for both sides of the connection where applicable, as well as the exact LCS definition itself in the Hercules configuration file.

Regards, Bob

PoC-dev commented 3 years ago

As requested, here are my tested parameters for OS/390 ADCD V2.10. These basically switch on most (if not every) capability to allow dynamic resource definition.

Compiling Hercules is out of scope for this documentation since it is documented elsewhere.

Linux Host Configuration

I'm using Debian 10 Linux on "real hardware" with a separate NIC (eth0) solely for Hercules, so unfiltered packet captures aren't too crowded with unrelated traffic. Tap0 is the "Linux End" of the virtual pipe ending at the LCS device within Hercules. This must be bridged to the outside world somehow.

You need the iproute2 package installed to have the necessary tools.

According to Hercules documentation, the MAC address is required to fulfill certain requirements about bits which must be set and others which must not be set. So I copied it literally from the example OAT to be on the safe side.

On the Hercules end of the TAP, the rightmost byte of the MAC address is automatically incremented by one. This incremented MAC is what you need to use in peer configurations.
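To avoid off-by-one mistakes when filling in peer configurations, the increment described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, wrapping from FF to 00 inside the last byte is an assumption (I haven't checked what Hercules does in that case), and the "required bits" check assumes the usual locally-administered (0x02 set) and unicast (0x01 clear) bits in the first byte are what the Hercules documentation means.

```python
def parse_mac(mac):
    """Split a colon-separated MAC string into six integer octets."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError("not a valid MAC address: %r" % mac)
    return octets


def is_locally_administered_unicast(mac):
    """Assumed requirement: bit 0x02 set and bit 0x01 clear in the first byte."""
    first = parse_mac(mac)[0]
    return (first & 0x02) != 0 and (first & 0x01) == 0


def hercules_end_mac(tap_mac):
    """Return the TAP MAC with its rightmost byte incremented by one.

    Assumption: the increment wraps within the last byte (FF -> 00).
    """
    octets = parse_mac(tap_mac)
    octets[5] = (octets[5] + 1) & 0xFF
    return ":".join("%02X" % o for o in octets)


if __name__ == "__main__":
    tap_mac = "02:00:FE:DF:00:42"  # MAC assigned to tap0 in this setup
    print(is_locally_administered_unicast(tap_mac))  # prints True
    print(hercules_end_mac(tap_mac))  # prints 02:00:FE:DF:00:43
```

With the tap0 address used here, the MAC to enter on the peer side would therefore be 02:00:FE:DF:00:43.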

I added the following lines to /etc/rc.local (which of course can be run manually, also):

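# Recreate the bridge from scratch (the delete reports an error if br0 does not exist yet)
/sbin/ip link del br0 type bridge
/sbin/ip link add br0 type bridge
# Create the TAP device used by the Hercules LCS, then set its MAC address and MTU
/sbin/ip tuntap add mode tap dev tap0
/sbin/ip link set dev tap0 address 02:00:FE:DF:00:42
/sbin/ip link set dev tap0 mtu 1500
# Attach the physical NIC and the TAP device to the bridge
/sbin/ip link set dev eth0 master br0
/sbin/ip link set dev tap0 master br0
# Bring all three interfaces up
/sbin/ip link set dev eth0 up
/sbin/ip link set dev tap0 up
/sbin/ip link set dev br0 up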

I am not sure if these are needed, but I've added them anyway (too much trial and error involved):

/sbin/bridge link set dev eth0 state 3
/sbin/bridge link set dev tap0 state 3

Other scenarios with a shared NIC should work also.

Hercules Configuration

Normally, you'd need just this: 0E40 LCS -e SNA tap0

For debugging purposes, I'm using two lines:

0E40    LCS  -e SNA -d tap0
t+0E40

I have not yet found the time to unify the configuration so that SNA and TCP/IP both run over LCS, which would let me get rid of the additional CTCI TUN and its separately routed subnet.

VTAM configuration

Extensive documentation about VTAM configuration can be found here. Scroll down to the z/OS Communications Server section.

Of particular interest might be…

The configuration shown below has by no means been refined or reduced to a minimal working set of definitions yet. Helpful input and testing are much appreciated!

SYS1.LOCAL.VTAMLST(ATCSTR00)

I left out the IPADDR and TCPNAME parameters for my EE configuration, since they are irrelevant. Apart from that, VTAM is told to be an APPN network node with no subarea functionality. The buffer parameters were left at default from the original ATCSTR00.

Create a backup copy before making changes. Just in case…

CONFIG=00,                                                             X
CONNTYPE=APPN,                                                         X
CPCDRSC=YES,                                                           X
CPCP=YES,                                                              X
DATEFORM=YMD,                                                          X
DYNLU=YES,                                                             X
DYNADJCP=YES,                                                          X
HOSTPU=P390$PU,                                                        X
NETID=APPN,                                                            X
NODETYPE=NN,                                                           X
NOPROMPT,                                                              X
NQNMODE=NQNAME,                                                        X
SSCPID=06,                                                             X
SSCPNAME=P390SSCP,                                                     X
SUPP=NOSUP,                                                            X
VERIFYCP=NONE,                                                         X
VFYRED=YES,                                                            X
CRPLBUF=(208,,15,,1,16),                                               X
IOBUF=(400,508,19,,1,20),                                              X
LFBUF=(104,,0,,1,1),                                                   X
LPBUF=(64,,0,,1,1),                                                    X
SFBUF=(163,,0,,1,1)

All other PDS members are new.

SYS1.LOCAL.VTAMLST(CDRSC)

I'm not sure if this is required for certain APPN scenarios or only when VTAM acts as a node in both Subarea and APPN networks.

         VBUILD TYPE=CDRSC
*
ILU1     CDRSC ALSLIST=ISTAPNPU

SYS1.LOCAL.VTAMLST(XCAETHNT)

This references the LCS device from the Hercules configuration at address E40. Hint: The ADCD System has all necessary IODF correctly set by default to support an LCS 3172 at E40.

XCAETH  VBUILD TYPE=XCA
*/*
ETHPRT    PORT CUADDR=E40,ADAPNO=0,SAPADDR=4,MEDIUM=CSMACD,            -
               DELAY=0,TIMER=30
*/*
ETHGRP   GROUP AUTOGEN=(10,E,X),                                       -
               CALL=INOUT,                                             -
               DIAL=YES,                                               -
               ANSWER=ON,                                              -
               DYNPU=YES,                                              -
               DYNPUPFX=ET,                                            -
               ISTATUS=ACTIVE

SYS1.LOCAL.VTAMLST(NIBBL802)

This is the switched major node defining my AS/400, called NIBBLER. NETID and CPNAME can be found by issuing a DSPNETATR. IDBLK/IDNUM, and MAXDATA can be found by looking at the line description for the LAN IOA with WRKLIND, Option 5. The DIALNO is the SAP address directly followed by the IOA's MAC address. Both to be found in the line description. SAP address 4 is the default, though.

NIB00    VBUILD TYPE=SWNET
*
NIBPU    PU    ADDR=01,                                                +
               DISCNT=NO,                                              +
               DYNLU=YES,                                              +
               MAXDATA=1496,                                           +
               NETID=APPN,                                             +
               CPNAME=NIBBLER,                                         +
               IDBLK=056,                                              +
               IDNUM=41700,                                            +
               CONNTYPE=APPN,                                          +
               CPCP=YES,                                               +
               DWACT=YES,                                              +
               HPR=NO,                                                 +
               MAXOUT=7,                                               +
               PACING=7,                                               +
               VPACING=7,                                              +
               MODETAB=ISTINCLM,                                       +
               SSCPFM=USSSCS,                                          +
               ISTATUS=ACTIVE,                                         +
               PUTYPE=2
*
NIBPTH   PATH  GRPNM=ETHGRP,                                           +
               DIALNO=0004002035B54164
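
For anyone adapting this to their own peer, the DIALNO value above can be assembled mechanically. The following Python sketch is only an illustration: the function name is made up, and the leading "00" is an assumption read off the example above (two digits in front of the SAP address), not something I've verified against the VTAM documentation.

```python
def build_dialno(sap, mac, prefix="00"):
    """Build a DIALNO string: prefix + SAP (2 hex digits) + MAC (12 hex digits).

    The default prefix "00" is assumed from the example in this post.
    """
    mac_hex = mac.replace(":", "").replace("-", "").upper()
    if len(mac_hex) != 12:
        raise ValueError("MAC address must have 12 hex digits: %r" % mac)
    int(mac_hex, 16)  # raises ValueError if there are non-hex characters
    if not 0 <= sap <= 0xFF:
        raise ValueError("SAP address must fit in one byte: %r" % sap)
    return "%s%02X%s" % (prefix, sap, mac_hex)


if __name__ == "__main__":
    # SAP 4 and the MAC address of the AS/400's LAN IOA, as used above
    print(build_dialno(4, "00:20:35:B5:41:64"))  # prints 0004002035B54164
```

Both input values come from the line description shown by WRKLIND, Option 5, as described above.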

SYS1.LOCAL.VTAMLST(ATCCON00)

As usual, this contains the resources to be activated at VTAM start time. I removed some apparently unneeded resources just giving error messages at startup time.

A0600,NSNA70X,NSNA90X,DYNMODEL,COSAPPN,CDRSC,                          -
A0TCP,XCAETHNT,P390APP,NIBBL802

Making it run

AFAIK it's not possible to dynamically reconfigure VTAM to the extent necessary with the new ATCSTR00. So, make the changes in ATCSTR, but leave out the switched major node for the other SNA node from ATCSTR00, and re-IPL. Alternatively, stop TCAM and VTAM, and restart both in reverse order.

I've observed that when the switched major node is activated while VTAM starts up, the connection won't come up. I don't know if this is a VTAM timing/retry configuration issue, or related to side effects of the new SNA code in LCS.

Issuing the proper vary command on the console for the major node yields:

- 17.56.37           V NET,ID=NIBBL802,ACT
  17.56.37 STC00004  IST097I VARY ACCEPTED
  17.56.37 STC00004  IST093I NIBPU ACTIVE
  17.56.37 STC00004  IST093I NIBBL802 ACTIVE
  17.56.37 STC00004  IST590I CONNECTOUT ESTABLISHED FOR PU NIBPU ON LINE
   E0E40000
  17.56.37 STC00004  IST1086I APPN CONNECTION FOR APPN.NIBBLER IS ACTIVE
   - TGN = 21
  17.56.37 STC00004  IST1096I CP-CP SESSIONS WITH APPN.NIBBLER ACTIVATED

For APPN Connections from End Node to (only one!) Network Node, and Network Node to Network Node to fully function, it's crucial to see CP-CP Sessions come up. I successfully tested the connection with APING in both directions. Output from VTAM:

04- 17.56.54           D NET,APING,ID=NIBBLER,LOGMODE=#INTER                  
    17.56.54 STC00003  IST097I DISPLAY ACCEPTED                               
    17.56.54 STC00003  IST1489I APING SESSION INFORMATION                     
    IST1490I DLU=APPN.NIBBLER SID=FD8F3F629B0B3271                          
    IST933I LOGMODE=#INTER  , COS=#INTER                                      
    IST875I APPNCOS TOWARDS SLU = #INTER                                      
    IST1460I TGN  CPNAME             TG TYPE      HPR                         
    IST1461I  21  APPN.NIBBLER       APPN         *NA*                        
    IST314I END                                                               
    17.56.55 STC00003  IST1457I VTAM APING VERSION 2R33 (PARTNER TP VERSION   
     2R43)                                                                    
    IST1490I DLU=APPN.NIBBLER SID=FD8F3F629B0B3271                          
    IST1462I ECHO IS ON                                                       
    IST1463I ALLOCATION DURATION: 34 MILLISECONDS                             
    IST1464I PROGRAM STARTUP AND VERSION EXCHANGE: 723 MILLISECONDS           
    IST1465I         DURATION      DATA SENT   DATA RATE   DATA RATE          
    IST1466I      (MILLISECONDS)    (BYTES)  (KBYTE/SEC)  (MBIT/SEC)          
    IST1467I               17            200          11           0          
    IST1467I               16            200          12           0          
00  IST1468I TOTALS:       33            400          12           0          
    IST1469I DURATION STATISTICS:                                             
    IST1470I MINIMUM = 16 AVERAGE = 16 MAXIMUM = 17                           
    IST314I END                                                               
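
Since seeing CP-CP sessions come up is the crucial part, checking a saved console log for it can be automated. The Python sketch below is just one way to do that, under some assumptions: the function names are made up, it relies only on the IST1086I and IST1096I message texts shown in the vary output above, and it expects each message to sit on a single line (a message wrapped by the console in the middle of the matched text would be missed).

```python
import re

# Message texts as they appear in the vary output above
CONNECTION_RE = re.compile(r"IST1086I APPN CONNECTION FOR (\S+) IS ACTIVE")
CPCP_RE = re.compile(r"IST1096I CP-CP SESSIONS WITH (\S+) ACTIVATED")


def active_connections(console_text):
    """Partners for which an APPN connection was reported active."""
    return set(CONNECTION_RE.findall(console_text))


def cp_cp_partners(console_text):
    """Partners for which CP-CP sessions were reported activated."""
    return set(CPCP_RE.findall(console_text))


def connections_without_cp_cp(console_text):
    """Partners that are connected but never showed CP-CP activation."""
    return active_connections(console_text) - cp_cp_partners(console_text)


SAMPLE = """\
  17.56.37 STC00004  IST097I VARY ACCEPTED
  17.56.37 STC00004  IST093I NIBPU ACTIVE
  17.56.37 STC00004  IST1086I APPN CONNECTION FOR APPN.NIBBLER IS ACTIVE
   - TGN = 21
  17.56.37 STC00004  IST1096I CP-CP SESSIONS WITH APPN.NIBBLER ACTIVATED
"""

if __name__ == "__main__":
    print(sorted(cp_cp_partners(SAMPLE)))             # prints ['APPN.NIBBLER']
    print(sorted(connections_without_cp_cp(SAMPLE)))  # prints []
```

An empty result from connections_without_cp_cp means every partner reported as connected also showed CP-CP activation in that log excerpt.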

My VTAM configuration also includes an Enterprise Extender (SNA over UDP) definition (not shown). Sessions forwarded by VTAM between my AS/400 (whose old OS only supports 802.2 DLC connections) and a newer IBM i 7.2 machine (whose new OS only supports SNA over Enterprise Extender) work flawlessly.

Reporting Problems

First, please always make sure you use the most recent code base. It's completely sufficient to run git pull && make && make install. No configure and time consuming full-compile necessary after the initial run.

In addition to providing the usual information for understanding any issues, please include

Huge thanks to Ian Shorter for his incredible work in making SNA over LCS possible.

Thanks to Marco Lorig for providing a remotely accessible R/390 environment for initial VTAM traces, and packet dumps on a "real" LCS.

Thanks to Jeff Snyder for providing an initial VTAM configuration to work with.

PoC-dev commented 3 years ago

I just tested connecting MS SNA Server as 3270/tn3270 converter via 802.2 DLC to the OS/390 instance described above. No VTAM configuration changes, just gleaning PU and LU names from the console messages, adjusting the settings in MS SNA Server accordingly. Could successfully connect with the accompanying MS SNA Server 3270 client. This is relying on local Windows users, so I don't know how to make stock tn3270 clients connect successfully. Yet.