Closed gjcullen closed 3 years ago
When TCPIP under zVM54 is brought down using the external interruption command:
#CP EXTERNAL
I myself have never had the need to specifically shutdown only(?) TCPIP, so I'm not familiar with this technique. Which z/VM userid do you enter this command in? What z/VM userid do you need to logon to, to enter this command? Operator? TCPIP? TCPMAINT?
I employ the zVM Programmable Operator (PROP) for automation. The TCPIP machine is accessible through Secondary Console Image Facility (SCIF), Both PROP and SCIF have been part of VM and zVM for a long time.
To bring down TCPIP you issue a command from an authorized user: #CP SMSG OPERATOR CMD NETSTAT CP EXTERNAL
. This is usually done in an automated fashion. You could also issue it manually, or to simplify, just logon to the TCPIP machine itself and issue #CP EXT
to cause an external interruption.
(The reason that I am shutting down only TCPIP is for isolation of the problem).
Geoff
Geoff, (@gjcullen)
I did some tests on both z/VM 7.1 using QETH/OSA (not CTCI) as well as on z/VM 6.3 using LCS as well as using CTCI (all on Windows(*)), and in none of my tests did z/VM crash. TCPIP appeared to be shutdown as expected and the system keep running just fine. There was no IFI004 crash.
So I was unable to recreate your problem with the z/VM systems I have. (I do not have z/VM 5.4 so I am unable to test with your exact (or semi-exact) environment.)
I will have to think about what type of tracing/debugging you might be able to do to help us try and determine where things are going wrong, but in the mean time is there any way for you to upgrade to a more recent version of z/VM? Or, barring that, have you checked to see if maybe there are some APAR(s)(?) (fixes) for z/VM 5.4 that maybe you're missing?
I'll keep this issue open for a little longer to give me time to maybe come up with a plan of attack, but at the moment it looks like the ultimate resolution might end up being "Unknown". :(
Sorry! :(
(But give me a few days to think about things. There might be some debugging/information gathering we might be able to do. I don't know yet.)
_(*) I have Linux (KDE Neon) installed in a VMware virtual machine on my Windows system, but I only use it to make sure Hercules builds and runs okay on Linux. I don't really use it for any serious Linux testing since not only do I know very little about Linux (I'm a Windows person, not a Linux person), but in all my years as a Hercules developer, I have unfortunately never been able to get networking working correctly on any of my Linux systems._ :(
Here are 3270 screen grab snippets along with my comments for the tests I performed:
z/VM 7.1 using QETH/OSA:
CP EXT
09:48:44 DTCSHU004E
09:48:44 DTCSHU009E 1 active client, with 1 connection in use.
09:48:44 DTCSHU010E I will delay shutting down for 30 seconds, so that
09:48:44 DTCSHU011E RSTs and shutdown notifications may be delivered.
09:48:44 DTCSHU012E If you wish to shutdown immediately, without warning,
09:48:44 DTCSHU013E type #CP EXT again.
09:48:44 DTCSHU014I
09:48:44 DTCSTM119I Server Telnet closed down. Bye.
CP DISC
DISCONNECT AT 09:49:00 PST MONDAY 03/08/21
Press enter or clear key to continue
(I probably should have waited longer (for the 30 seconds to expire) before disconnecting! Sorry!)
09:47:53 GRAF 0460 RECONNECT TCPIP USERS = 10 BY IBMVM1
09:49:00 GRAF 0460 DISCONNECT TCPIP USERS = 10
09:49:14 USER DSC LOGOFF AS TCPIP USERS = 9
09:49:14 OSA 0600 DETACHED TCPIP 0600 BY TCPIP
09:49:14 OSA 0601 DETACHED TCPIP 0601 BY TCPIP
09:49:14 OSA 0602 DETACHED TCPIP 0602 BY TCPIP
No z/VM 7.1 crash. No IFI004. System still running just fine!
z/VM 6.3 using LCS:
No z/VM 6.3 crash. No IFI004. System still running just fine!
z/VM 6.3 using CTCI:
CP EXT
***** 03/08/21 *****
11:19:30 DTCSHU004E
11:19:30 DTCSHU009E 2 active clients, with 2 connections in use.
11:19:30 DTCSHU010E I will delay shutting down for 30 seconds, so that
11:19:30 DTCSHU011E RSTs and shutdown notifications may be delivered.
11:19:30 DTCSHU012E If you wish to shutdown immediately, without warning,
11:19:30 DTCSHU013E type #CP EXT again.
11:19:30 DTCSHU014I
11:19:30 DTCSTM119I Server Telnet closed down. Bye.
USER DSC LOGOFF AS FTPSERVE USERS = 11
11:20:00 DTCCTC082E CTCA shutting down:
11:20:00 DTCPRI385I Device CTCDEV1:
11:20:00 DTCPRI386I Type: CTC, Status: Ready
11:20:00 DTCPRI387I Envelope queue size: 0
11:20:00 DTCPRI388I Address: 0E20
11:20:00 DTCPRI280I ShutDown at 76.775 seconds
DTCRUN1014I Server ended normally at 11:20:00 on 8 Mar 2021 (Monday)
DTCRUN1019I Server will not be logged off because you are connected
CON FILE 0001 SENT TO TCPMAINT RDR AS 0043 RECS 0132 CPY 001 T NOHOLD NOKEEP
Ready; T=1.14/1.34 11:20:00
No z/VM 6.3 crash. No IFI004. System still running just fine!
The Hercules logs for each test were unremarkable as well. Nothing unusual. No crash. No IFI004, and after waiting for a minute or three, z/VM's shutdown proceeded normally as well. Nothing unusual.
As I said, this is all under Windows. Maybe Linux tuntaps behave differently? I don't know. As I explained I am unable to perform any type of networking tests on my Linux VMware system.
I still have not figured out what type of debugging/tracing can be done to help determine why this is happening for you. I personally suspect it might be a bug in your z/VM 5.4. Where did you get it? Have you looked for any maintenance that can be applied?
When the line in the configuration reads:
E200.2 3088 CTCI /dev/net/tun 1500 192.168.100.1 192.168.100.2 255.255.255.0I will have NO connectivity across the CTC, but I will have no problems during the TCPIP shutdown.
This is hardly surprising. Device type 3088 is unknown/invalid:
19:03:52.520 HHC01462E 0:0E20 devtype 3088 not recognized 19:03:52.520 HHC00007I Previous message from function 'attach_device' at config.c(1300)
As a result, device 0E20 and 0E21 were never added to your Hercules configuration and thus your z/VM system did not have any type of networking device available to it, thereby causing TCPIP to never get started. Thus it comes to no surprise that you had no networking connectivity as well as no problems shutting down TCPIP since it was never started in the first place!
How did you configure your z/VM's networking? May I see your PROFILE TCPIP
and SYSTEM DTCPARMS
files?
Have you (with the possible help from someone else) tried analyzing the z/VM dump? Does it reveal anything unusual?
I'm stuck! :(
Hi Fish,
I've run into this or a very similar issue as well. If I run VM/ESA 2.4.0 with an OS/390 2.10 guest and I shut down TCPIP normally (i.e. #CP EXT) in VM/ESA, no problem, but if I shutdown TCPIP in the OS/390 guest (P TCPIP), VM/ESA blows up with the IFI004.
My Hercules config statement for the OS/390 TCPIP CTC looks like: 5EE.2 CTCI -s 255.255.255.0 192.168.1.105 192.168.1.32 # TCPIP to LAN I have confirmed that IP works to/from OS/390 before the failure. If I run OS/390 outside of VM/ESA, it appears to shut down normally. I don't see any abends or Hercules issues.
I am running a recent build of Hercules on Windows 10. I downloaded it on 2/21. HHC01413I Hercules version 4.4.9999.0-SDL (4.4.9999.0)
What can I provide to help resolve this issue?
Fish,
I have gotten closer to the problem area, with more details (this time repeating the same problems with a zVM7.1 image:
The abend is triggered, not by the shutdown of TCPIP stack, but by the stopping of the Channel-to-Channel connection between zVM TCPIP and the Linux host system. I am beginning to think at TUN/TAP interaction. Below are the CTC definitions and the console logs of the TCPIP shutdown sequences.
I will continue to narrow the problem. I appreciate your time on this matter. Respectfully, Geoff
Hercules Configuration snippet defining the CTC connection:
E200.2 CTCI /dev/net/tun 1500 192.168.100.3 192.168.100.2 255.255.255.0
zVM TCPIP Configuration snippet defining the CTC connection:
; -----------------------------------------------------------------------
DEVICE CTCDEV1 CTC E200
LINK CTCLINK1 CTC 0 CTCDEV1
;
HOME
; IP Address Subnet Mask Link Name
; --------------- --------------- ------------
192.168.100.3 255.255.255.0 CTCLINK1
;
PRIMARYINTERFACE CTCLINK1
GATEWAY
;
; (IP) Network Subnet First Link MTU
; Address Mask Hop Name Size
; ----------- ------------- --------------- ---------- -----
192.168.100.2 255.255.255.0 = CTCLINK1 1500
;
DEFAULTNET 192.168.100.2 CTCLINK1 1500
START CTCDEV1
Directory entry for TCPIP:
IDENTITY TCPIP XXXXXXXX 128M 256M BG
INCLUDE TCPCMSU
BUILD ON * USING SUBCONFIG TCPIP-1
IUCV ALLOW
IUCV ANY PRIORITY
IUCV *CCS PRIORITY MSGLIMIT 255
IUCV *VSWITCH MSGLIMIT 65535
MACHINE XA
OPTION QUICKDSP SVMSTAT MAXCONN 1024 DIAG98 APPLMON
SHARE RELATIVE 3000
DEDICATE E200 E200
DEDICATE E201 E201
CONSOLE 0009 3215 T OPERATOR
zVM Programmable Operator log snippet showing network shutdown:
---------------------------------------
Commencing VM Network Shutdown sequence
----------------------------------------
RSCS : 20:05:00 DMTCMX005I Location CPZVM71B(OPERATOR) executing: SHUTDOWN
RSCS : 20:05:00 DMTNCR570I Link CPZVM54A now set to deactivate
RSCS : 20:05:00 DMTPRD779I TCP Port redirector stopped
RSCS : 20:05:00 DMTTNE183I Link CPZVM54A session terminated
RSCS : 20:05:00 DMTMAN002I Link CPZVM54A deactivated
RSCS : 20:05:00 DMTNCR570I Link CPZVM71A now set to deactivate
RSCS : 20:05:00 DMTTNE183I Link CPZVM71A session terminated
RSCS : 20:05:00 DMTMAN002I Link CPZVM71A deactivated
RSCS : 20:05:00 DMTMAN100I RSCS Networking terminated
RSCS : CONNECT= 00:00:22 VIRTCPU= 000:00.09 TOTCPU= 000:00.19
RSCS : LOGOFF AT 20:05:01 EST TUESDAY 03/09/21
20:05:01 USER DSC LOGOFF AS RSCS USERS = 26
GCS : USER DSC LOGOFF AS RSCS USERS = 26
VM TCP/IP Netstat Level 710 TCP/IP Server Name: TCPIP
TCPIP : 20:05:01 DTCMON442I CP EXT issued by OPERATOR; results pending
TCPIP : 20:05:01 DTCMON443I CP EXT issued by OPERATOR; command successful
TCPIP : ***** 03/09/21 *****
TCPIP : 20:05:01 DTCSHU004E
TCPIP : 20:05:01 DTCSHU009E 6 active clients, with 2 connections in use.
TCPIP : 20:05:01 DTCSHU010E I will delay shutting down for 30 seconds, so that
TCPIP : 20:05:01 DTCSHU011E RSTs and shutdown notifications may be delivered.
TCPIP : 20:05:01 DTCSHU012E If you wish to shutdown immediately, without warni
TCPIP : 20:05:01 DTCSHU013E type #CP EXT again.
TCPIP : 20:05:01 DTCSHU014I
TCPIP : 20:05:01 DTCSTM119I Server Telnet closed down. Bye.
CP return code = 0
VM TCP/IP Netstat Level 710 TCP/IP Server Name: DTCVSW1
CP return code = 0
VM TCP/IP Netstat Level 710 TCP/IP Server Name: DTCVSW2
CP return code = 0
GCS : CONNECT= 00:00:23 VIRTCPU= 000:00.00 TOTCPU= 000:00.03
GCS : LOGOFF AT 20:05:01 EST TUESDAY 03/09/21
20:05:01 USER DSC LOGOFF AS GCS USERS = 25
Command complete
PROP200H: The status of RSCS is changed from UP to QUIESCING
HCPMSG045E RSCS not logged on
PROP200H: The status of RSCS is changed from QUIESCING to DOWN
HCPMSG045E RSCS not logged on
PROP200H: The status of TCPIP is changed from UP to QUIESCING
20:05:01 USER DSC LOGOFF AS DTCVSW1 USERS = 24
20:05:01 USER DSC LOGOFF AS DTCVSW2 USERS = 23
20:05:01 USER DSC LOGOFF AS FTPSERVE USERS = 22
TCPIP : USER DSC LOGOFF AS FTPSERVE USERS = 22
TCPIP : 20:05:31 DTCCTC082E CTCA shutting down:
TCPIP : 20:05:31 DTCPRI385I Device CTCDEV1:
TCPIP : 20:05:31 DTCPRI386I Type: CTC, Status: Ready
TCPIP : 20:05:31 DTCPRI387I Envelope queue size: 0
TCPIP : 20:05:31 DTCPRI388I Address: E200
IFI004 ABEND occurs after this point
TCPIP console log snippet showing TCPIP shutdown:
20:04:39 DTCCTC095E Sense id: FF 30 88 for device E200
20:04:39 DTCCTC095E Sense id: FF 30 88 for device E201
20:04:39 DTCSTM213I Telnet server: Global connection to *CCS CP System Service established
20:04:39 DTCSTM216I Telnet server: First line of *CCS logo is: z/VM ONLINE
--CPZVM71B--PRESS BREAK KEY TO BEGIN SESSION
-------------------------------------------------
Telnet and the Local VM Network is UP and READY
-------------------------------------------------
20:05:01 DTCMON442I CP EXT issued by OPERATOR; results pending
20:05:01 DTCMON443I CP EXT issued by OPERATOR; command successful
***** 03/09/21 *****
20:05:01 DTCSHU004E
20:05:01 DTCSHU009E 6 active clients, with 2 connections in use.
20:05:01 DTCSHU010E I will delay shutting down for 30 seconds, so that
20:05:01 DTCSHU011E RSTs and shutdown notifications may be delivered.
20:05:01 DTCSHU012E If you wish to shutdown immediately, without warning,
20:05:01 DTCSHU013E type #CP EXT again.
20:05:01 DTCSHU014I
20:05:01 DTCSTM119I Server Telnet closed down. Bye.
Fish, I created an OBEYFILE for TCPIP to surgically stop the channel-to-channel adapter. Now with TCPIP in the initialized and fully "up" state (as opposed to shutting down), if I execute the OBEYFILE, an IFI004 CP Abend will immediately follow. Same on both zVM54 and zVM71.
Geoff
Okay, I managed to modify my existing z/VM 7.1 system to use CTCI, and once again, I experienced zero problems shutting down TCPIP. No IFI004. I even re-started TCPIP by logging completely off of the TCPIP user ID and then logging back on again, and sure enough, TCPIP networking came up just fine and everything worked.
Again, this is on Windows, where networking is handled completely differently than on Linux, so I can only guess that whatever the problem is that's causing your IFI004. it's unique to Linux tuntap only. Someone else more experienced with Linux tuntap networking and Hercules is going to have to take over from here. I don't think there's anything more I can do. :(
Here's my Hercules CTCI statement:
0E20.2 CTCI -m 02-00-5E-A8-00-47 192.168.0.71 192.168.0.107
Here's my TCPMAINT's PROFILE TCPIP
and SYSTEM DTCPARMS
files:
(profile tcpip):
00001 ; ----------------------------------------------------------------------
00002 ; - PROFILE TCPIP created by DTCIPWIZ EXEC on 11 Jun 2020
00003 ; - Configuration program run by MAINT710 at 18:21:02
00004 ; ----------------------------------------------------------------------
00005 ; %%File Origin Indicator - DO NOT REMOVE OR ALTER the next line%%
00006 ; %%TCPIP%%PROFILE%%STCPIP%%
00007 ; ----------------------------------------------------------------------
00008 OBEY
00009 OPERATOR TCPMAINT MAINT MPROUTE REXECD SNMPD SNMPQE LDAPSRV MAINT710
00010 ENDOBEY
00011 ; ----------------------------------------------------------------------
00012 PORT
00013 ; 20 TCP FTPSERVE NOAUTOLOG ; FTP Server
00014 ; 21 TCP FTPSERVE ; FTP Server
00015 23 TCP INTCLIEN ; TELNET Server
00016 ; 25 TCP SMTP ; SMTP Server
00017 ; 111 TCP PORTMAP ; Portmap Server
00018 ; 111 UDP PORTMAP ; Portmap Server
00019 ; 161 UDP SNMPD ; SNMP Agent
00020 ; 162 UDP SNMPQE ; SNMPQE Agent
00021 ; 389 TCP LDAPSRV ; LDAP Server
00022 ; 389 UDP LDAPSRV ; LDAP Server
00023 ; 512 TCP REXECD ; REXECD Server (REXEC)
00024 ; 514 TCP REXECD ; REXECD Server (RSH)
00025 ; 520 UDP MPROUTE NOAUTOLOG ; Multiple Protocol Routing Server
00026 ; 608 TCP UFTD ; UFT Server
00027 ; 636 TCP LDAPSRV ; LDAP Server (Secure)
00028 ; 636 UDP LDAPSRV ; LDAP Server (Secure)
00029 ; 2049 UDP VMNFS ; NFS Server
00030 ; 2049 TCP VMNFS NOAUTOLOG ; NFS Server
00031 ; ----------------------------------------------------------------------
00032 DEVICE CTCDEV CTC 0E20
00033 LINK CTCLINK CTC 0 CTCDEV
00034 ; (End DEVICE and LINK statements)
00035 ; ----------------------------------------------------------------------
00036 ; ----------------------------------------------------------------------
00037 HOME
00038 192.168.0.71 255.255.0.0 CTCLINK
00039 ; (End HOME Address information)
00040 ; ----------------------------------------------------------------------
00041 GATEWAY
00042 ; Network Subnet First Link MTU
00043 ; Address Mask Hop Name Size
00044 ; ------------- --------------- --------------- ---------------- -----
00045 DEFAULTNET 192.168.0.99 CTCLINK 1492
00046 ; (End GATEWAY Static Routing information)
00047 ; ----------------------------------------------------------------------
00048 START CTCDEV
00049 ; (End START statements)
00050 ; ----------------------------------------------------------------------
(system dtcparms):
00001 .**********************************************************************
00002 .* SYSTEM DTCPARMS created by DTCIPWIZ EXEC on 11 Jun 2020
00003 .* Configuration program run by MAINT710 at 18:20:48
00004 .**********************************************************************
00005 :nick.TCPIP :type.server
00006 :class.stack
00007 :attach.0E20-0E21
Here's me issuing CP EXT
from the TCPIP user id to shutdown TCPIP networking:
CP EXT
***** 03/10/21 *****
17:08:20 DTCSHU004E
17:08:20 DTCSHU009E 3 active clients, with 1 connection in use.
17:08:20 DTCSHU010E I will delay shutting down for 30 seconds, so that
17:08:20 DTCSHU011E RSTs and shutdown notifications may be delivered.
17:08:20 DTCSHU012E If you wish to shutdown immediately, without warning,
17:08:20 DTCSHU013E type #CP EXT again.
17:08:20 DTCSHU014I
17:08:20 DTCSTM119I Server Telnet closed down. Bye.
17:08:50 DTCCTC082E CTCA shutting down:
17:08:50 DTCPRI385I Device CTCDEV:
17:08:50 DTCPRI386I Type: CTC, Status: Ready
17:08:50 DTCPRI387I Envelope queue size: 0
17:08:50 DTCPRI388I Address: 0E20
17:08:50 DTCPRI280I ShutDown at 197.838 seconds
DTCRUN1014I Server ended normally at 17:08:50 on 10 Mar 2021 (Wednesday)
DTCRUN1019I Server will not be logged off because you are connected
CON FILE 0001 SENT TO TCPMAINT RDR AS 0036 RECS 0100 CPY 001 T NOHOLD NOKEEP
Ready; T=2.31/2.66 17:08:51
logoff
Here's what appears on the operator console:
17:08:12 GRAF 0460 RECONNECT TCPIP USERS = 10 BY IBMVM1
17:10:34 GRAF 0460 LOGOFF AS TCPIP USERS = 9
17:10:34 CTCA 0E20 DETACHED TCPIP 0E20 BY TCPIP
17:10:34 CTCA 0E21 DETACHED TCPIP 0E21 BY TCPIP
_Again, no IFI004 at all. System stays up and running just fine._
Here's me logging onto user id TCPIP again to restart TCPIP:
DMSWSP100W Shared Y-STAT not available
DMSACP723I D (198) R/O
DMSACP723I E (591) R/O
DMSACP723I F (592) R/O
Ready; T=0.05/0.07 17:12:26
DTCRUN1022I Console log will be sent to default owner ID: TCPMAINT
DTCRUN1046I Using TCP/IP data file: TCPIP DATA F1
DTCRUN1046I Using 'server' definition (TCPIP) from file: SYSTEM DTCPARMS D1
DTCRUN1046I Using 'class' definition (STACK) from file: IBM DTCPARMS E1
DTCRUN1096I STORAGE = 128M
DTCRUN1038I Server is not configured to support secure connections
DTCRUN1021R To cancel TCP/IP Stack startup, type any non-blank character and
press ENTER. To continue startup, just press ENTER.
DTCRUN1011I Server started at 17:12:30 on 10 Mar 2021 (Wednesday)
DTCRUN1011I Running server command: TCPIP
DTCRUN1011I No parameters in use
DTCRUN1048I SLVL service information for: TCPIP MODULE E2
DTCRUN1048I CMHOSTN ZVM620 CMPRCOM PI40702 CMTCPPR PI21053
DTCRUN1048I CMUNTOK ZVM620 FPIPDOW PH01192 FPI6DOW PH01192
DTCRUN1048I FPI6UP PI08319 FPNOTIF PH01192 FPPARSE ZVM630
DTCRUN1048I FPQDIO PI73016 FPQUEUE PH01192 FPROUND PH01192
DTCRUN1048I FPSCHED PH01192 FPSOCKRE PH01192 FPTCPDOW PH01192
DTCRUN1048I FPTCPREQ PH01192 FPTCPUP PH01192 FPTIMER ZVM620
DTCRUN1048I FPTOCLAW PQ95126 FPTOCTC PQ70168 FPTOPC3 PQ59871
DTCRUN1048I FPTREE ZVM620 FPUTIL PH01192 F6TCPUP PH01192
DTCRUN1048I TCARP ZVM710 TCCONFIG ZVM630 TCCONSI ZVM620
DTCRUN1048I TCDIP98 PQ19769 TCFPSM ZVM620 TCFR182 ZVM620
DTCRUN1048I TCINT MT06108 TCIPDOW ZVM630 TCIPUP PI08319
DTCRUN1048I TCMIB PI99184 TCMON PH01192 TCMPRIO ZVM620
DTCRUN1048I TCNED PI08319 TCNOTIF ZVM630 TCOFFLOP PQ21340
DTCRUN1048I TCPARSE PI99184 TCPDOWN ZVM620 TCPERUP PQ49456
DTCRUN1048I TCPEXITS ZVM620 TCPIP PI99184 TCPIPIN ZVM710
DTCRUN1048I TCPKT ZVM620 TCPREQU PH00032 TCPRINT PH01192
DTCRUN1048I TCPSSL PH01192 TCPUP PH01192 TCQDIO PI21053
DTCRUN1048I TCQUEUE PH01192 TCROUND ZVM620 TCSOCKRE PI97021
DTCRUN1048I TCSYSIO PQ51738 TCTOATM ZVM620 TCTOCLAW PQ95126
DTCRUN1048I TCTOCTC ZVM620 TCTOHPPI PQ51891 TCTOIUC ZVM630
DTCRUN1048I TCTOOSD PI52850 TCTOPC3 PQ66267 TCTREEP PI68525
DTCRUN1048I TCUDPRE ZVM620 TCUDPUP ZVM630 TCUTIL PI99184
DTCRUN1048I TCVMSUB ZVM620 TNLDSFP PI38762 TNSTMAS PI99184
DTCRUN1048I TNTOCP PI25555 TNTOTCP PI99184 T6PREQU PM96884
DTCRUN1048I T6PSSL PI97021 T6SOCKRE ZVM650
DTCTCP001I z/VM TCP/IP Level 710
***** 03/10/21 *****
17:12:30 DTCIPI008I Initializing... TCPIP MODULE E2 dated 03/13/19 at 16:49
17:12:30 DTCIPI052I TCP/IP Module Load Address: 00BD1000
17:12:30 DTCIPI009I Devices will use diagnose 98 real channel program support
17:12:30 DTCIPI012I TCP/IP running under z/VM system
17:12:30 DTCIPI005I Trying to open TCPIP TCPIP *
17:12:30 DTCIPI005I Trying to open ZVM71 TCPIP *
17:12:30 DTCIPI005I Trying to open PROFILE TCPIP *
17:12:30 DTCIPI006I Using profile file PROFILE TCPIP D1 dated 03/10/21 at 16:33
17:12:31 DTCFPM004I FPSM is available with 0 pages allocated
17:12:31 DTCFPM005I FPSM may allocate an additional 23348 pages out of 25943
17:12:31 DTCCTC080I CTCA initializing:
17:12:31 DTCPRI385I Device CTCDEV:
17:12:31 DTCPRI386I Type: CTC, Status: Not started
17:12:31 DTCPRI387I Envelope queue size: 0
17:12:31 DTCPRI388I Address: 0E20
17:12:32 DTCPUP096I Precedence designations will be overridden
17:12:32 DTCPDO083I Will honor ICMP redirects
17:12:32 DTCPDO226I Will honor ICMPv6 redirects
17:12:32 DTCPDO150I Support for IPv4 Path MTU Discovery is disabled
17:12:32 DTCIPU044I IP forwarding is enabled
17:12:32 DTCUDP010I Limit on incoming UDP datagram queue size enabled
17:12:53 DTCIPI035I TCP/IP will not dump if a program check occurs
17:12:53 DTCIPI036I Access to TCP/IP services is not restricted
17:12:53 DTCIPI033I Only users in the obey list and users who have a port explic
itly reserved may use ports 1-1023
17:12:53 DTCIPI038I Internal consistency checking is disabled
17:12:53 DTCIPI039I Support for RFC 1323 is enabled
17:12:53 DTCSSL037I Local connections will not be protected by SSL
17:12:53 DTCIPI023I TCP-IP initialization complete.
17:12:53 DTCSTM237I Telnet server: Using port 23
17:12:53 DTCSTM298I Telnet server: Line mode input will be presented when reques
ted
17:12:53 DTCSTM296I Telnet server: Erase All Unprotected (EAU) data will be tran
smitted
17:12:53 DTCSTM239I Telnet server: No inactivity timeout
17:12:53 DTCSTM240I Telnet server: Every 600 seconds a timing mark option packe
will be sent.
17:12:53 DTCSTM299I Telnet server: EOJTIMEOUT is 120 seconds
17:12:53 DTCSTM269I Telnet server: SCANInterval is 60 seconds
17:12:53 DTCSTM255I Telnet server: Suppress-Go-Ahead enabled
17:12:53 DTCSTM291I Telnet server: TN3270E is enabled
17:12:53 DTCOTC050I ConnectExit is disabled for TN3270E sessions
17:12:53 DTCSTM256I Telnet server: Line-mode terminal names will be prefixed wit
h "TCPIP"
17:12:53 DTCSTM263I Telnet server: Will use logical device addresses in the rang
e 00000000 through 00000FFF
17:12:53 DTCSTM305I Telnet server: Secure Connections are NOT ALLOWED
17:12:53 DTCSTM313I Telnet server: TLSLABEL is ignored when the Secure Connectio
ns value is NEVER.
17:12:53 DTCSTM243I ******************************************
17:12:53 DTCSTM244I Log of IBM TCP/IP Telnet Server Users started on 03/10/21 at
17:12:53
17:12:53 DTCSTM245I
17:12:53 DTCCTC095E Sense id: FF 30 88 for device 0E20
17:12:53 DTCSTM213I Telnet server: Global connection to *CCS CP System Service e
stablished
17:12:53 DTCCTC095E Sense id: FF 30 88 for device 0E21
17:12:53 DTCSTM216I Telnet server: First line of *CCS logo is: z/VM ONLINE
--ZVM71 --PRESS BREAK KEY TO BEGIN SESSION
CP DISC
DISCONNECT AT 17:13:55 PST WEDNESDAY 03/10/21
Press enter or clear key to continue
_And again, no IFI004 at all. System stays up and running just fine, and all networking functionality appears to be fully functional and once again working fine._
Geoff,
THREE things:
IDENTITY TCPIP
, you have two "dedicate" statements for your Hercules CTCI device:
IDENTITY TCPIP XXXXXXXX 128M 256M BG INCLUDE TCPCMSU BUILD ON * USING SUBCONFIG TCPIP-1 IUCV ALLOW IUCV ANY PRIORITY IUCV *CCS PRIORITY MSGLIMIT 255 IUCV *VSWITCH MSGLIMIT 65535 MACHINE XA OPTION QUICKDSP SVMSTAT MAXCONN 1024 DIAG98 APPLMON SHARE RELATIVE 3000 DEDICATE E200 E200 DEDICATE E201 E201 CONSOLE 0009 3215 T OPERATOR
I don't believe these are needed. Instead, I believe the devices are automatically attached to user TCPIP via the SYSTEM DTCPARMS
file's ":attach." statement. (Refer to my SYSTEM DTCPARMS
file in my previous comment further above)
Have you tried removing them?
I noticed your PROFILE TCPIP
file has an extra statement between the GATEWAY
statement and the DEFAULTNET
statement that my own PROFILE TCPIP file does not have:
[...] GATEWAY ; ; (IP) Network Subnet First Link MTU ; Address Mask Hop Name Size ; ----------- ------------- --------------- ---------- ----- 192.168.100.2 255.255.255.0 = CTCLINK1 1500 ; DEFAULTNET 192.168.100.2 CTCLINK1 1500 START CTCDEV1
In my own PROFILE TCPIP file, the DEFAULTNET statement immediately follows my GATEWAY statement, without any statements in between them (other than comments of course!).
Have you tried removing that extra unneeded statement?
... if I execute the OBEYFILE, an IFI004 CP Abend will immediately follow. Same on both zVM54 and zVM71.
I am not familiar with this.
Can you please post a copy of this OBEYFILE thing of yours along with instructions on how I should use it? Thanks!
That's all for now.
Something else:
Do you have any type of OSTAILOR
statement in your Hercules configuration file? Is it set to VM
? Or some other value? It should be OSTAILOR VM
.
To be safe, may I see your ENTIRE Hercules configuration file please?
May I also see your ENTIRE Hercules logfile when the IFI004 occurs?
Please upload or attach them both to your next GitHub Issue comment please. Thanks.
Another thing:
Have you tried entering the Hercules commands t+E200
and t+E201
immediately before you try shutting down TCPIP? (i.e. immediately before you do whatever it is that causes your IFI004?) This will create a Hercules trace of all I/O to your CTCI device that might provide a clue as to what may be going wrong.
Thanks.
That's all I can think of right now.
Fish, I will get all the information together and send you. I may not get to the task for a few days. Geoff
No hurry.
Fish,
Here is the information you requested. I am using a zVM5.4 image:
I use DEDICATE statements in the VM Directory for the CTCA instead of using DTCPARMS so that the TCPIP machine can run without the need of A and B privilege classes.
The statement in the TCPIP config between GATEWAY and DEFAULTNET is to support the routing to the outside world, this one is for the first hop.
The OBEY file is named KILLTCP TCPIP (fm). Its contents are STOP CTCDEV1. It resides on TCPMAINT's 592 disk. It is activated by NETSTAT KILLTCP TCPIP B from a user which is authorized in the TCPIP config to issue an OBEYFILE. The filemode is based upon the way that the TCPIP minidisks are accessed.
OSTAILOR is set to VM
The Hercules configuration follows and the log will be uploaded:
#--------------------------------------------------------
# CPU Configuration (for zVM5.4)
#----------------------------------------------------------
ARCHLVL z/ARCH # Set Architecture level
FACILITY ENABLE bit44
FACILITY ENABLE BIT45
FACILITY ENABLE BIT37
#FACILITY ENABLE SIGP_SETARCH_S370
FACILITY QUERY ALL
#CAPPING OFF
#CPUIDFMT BASIC
CPUMODEL 789A # CPU Model Identification (NETID)
CPUSERIAL 002855 # CPU Serial Identification
CPUVERID 00 # CPU Version Identification
#DEFSTORE MAIN 2G # Define storage
MODEL EMULATOR # STSI returned model
PLANT ZZ # STSI returned plant
MANUFACTURER HRC # STSI returned manufacturer
LPARNAME HERCULES # DIAG 204 returned lparname
ENGINES CP # CPU types
MAINSIZE 2048M # Main storage size in megabytes
XPNDSIZE 0 # Expanded storage size in megabytes
MAXCPU 1 # Maximum number of CPUs
NUMCPU 1 # Number of CPUs
MAXRATES 1440 # Maximum mips/sio rate
ECPSVM NO # VM Assist : NO or Level (20 recommend)
LPARNUM 00 # LPAR ID
TIMERINT 50 # Interval Timer update in microsec
TODDRAG 1.0000 # TOD clock drag factor
#----------------------------------------------------------
# OS Tailoring
#----------------------------------------------------------
PGMPRDOS LICENSED # Allow
LOADPARM 001F
OSTAILOR VM # OS tailoring
SYSEPOCH 1900 # Base year for initial TOD clock
TZOFFSET +0000 # Using UTC (GMT)
#IODELAY 1000 # I/O Delay
#LOGOPTS TIMESTAMP # Timestamp all log messages
#----------------------------------------------------------
# Hercules Service Processor and Hercules Application Window
#----------------------------------------------------------
# MODPATH /usr/local/lib/hercules # Where to search for modules
# LDMOD dyninst tcpip # Modules to be loaded
#----------------------------------------------------------
# Hercules Service Processor and Hercules Application Window
#----------------------------------------------------------
CODEPAGE default # CodePage conversion table
SHCMDOPT DISABLE NODIAG8 # No shell commands from
#DIAG8CMD ENABLE ECHO # OS may issue commands via DIAG 8
DIAG8CMD DISABLE NOECHO
# HTTPPORT 8081 noauth userid password # HTTP server port
# HTTPROOT /usr/local/share/hercules/ # HTTP root directory
PANOPT RATE=400 # Panel refresh rate in msec
#----------------------------------------------------------
# Hercules 3270 Consoles
#----------------------------------------------------------
CNSLPORT 3270 # TCP port number to which consoles connect
CONKPALV (3,1,10) # Console recovery periods.
#----------------------------------------------------------
# Advanced Hercules Performance Tailoring
#
# +----------------------------------------------------------+
# | Caution: Modification of these parameters may adversely |
# | affect the performance of the host system |
# | and/or Hercules. |
# +----------------------------------------------------------+
#----------------------------------------------------------
# HERCPRIO 5 # Hercules process runs at Normal priority
# CPUPRIO 2 # CPU thread(s) run at Low priority
# SRVPRIO 4 # TCP/IP Server Thred(s), HTTP, CONSOLE, etc.
# DEVPRIO 8 # Device thread(s) run at Below Normal priority
# TODPRIO -20 # TOD Clock and timer thread are Time Critical
#----------------------------------------------------------
# Integrated Hercules I/O Controller
#
# DEVTMAX 8 # Device threads: -1=exit on complete, 0=unlimited
# SHRDPORT 3990 # TCP port number for sharing DASD images on
# this instance of Hercules (inactive)
# .-----------------------Device number
# | .-----------------Device type
# | | .---------File name and parameters
# | | |
# V V V
# ---- ---- --------------------
0000 SYSG SYSGCONS
#0009 3215-C / noprompt
000C 3505 /home/maint/hercules/reader00C.txt ascii trunc
000D 3525 /home/maint/hercules/punch00D.txt ascii
000E 1403 /home/maint/hercules/print00E.txt crlf
001F 3270
0020 3270
0021 3270
0022 3270
0023 3270
0024 3270
0025 3270
0026 3270
0027 3270
0028 3270
0029 3270
002A 3270
002B 3270
002C 3270
002D 3270
002E 3270
002F 3270
0380 3490 /home/maint/hercules/3490.TAPE0.aws
0381 3490 /home/maint/hercules/3490.TAPE1.aws
#0580 3490 /cdrom/tapes/*
#0581 3490 /home/maint/Downloads/cpdvd/540vm.ins
#0582 3490 /home/maint/Downloads/cpdvd/
#0583 3490 /home/maint/Downloads/zvm54rsu.iso
0123 3390 /run/media/maint/VMDASD/540RES.0123
0124 3390 /run/media/maint/VMDASD/540SPL.0124
0125 3390 /run/media/maint/VMDASD/540PAG.0125
0126 3390 /run/media/maint/VMDASD/540USR.0126
0127 3390 /run/media/maint/VMDASD/540WK1.0127
0128 3390 /run/media/maint/VMDASD/54TDSK.0128
#E200-E201 CTCI /dev/net/tun 1500 192.168.100.1 192.168.100.2 255.255.255.0
E200.2 CTCI /dev/net/tun 1500 192.168.100.1 192.168.100.2 255.255.255.0
zvm54cpa.log
:Respectfully, Geoff
Will be sending a hercules trace of E200-E201 CTCA pair.
Geoff
As I recall, I had much trouble with the z/VM 5.3 Evaluation Edition when 2GB or more of MAINSIZE was specified in Hercules. I had to reduce my Hercules MAINSIZE to a value less than 2GB (e.g. 2000MB or 1900MB, etc):
MAINSIZE 1900M # Main storage size in megabytes
Have you tried that yet?
Also, it there a reason why you're using CTCI instead of LCS or QETH/OSA for your networking? Just curious! I've personally always used OSA for all of my networking needs, including z/VM. Sometimes I use LCS, but I rarely use CTCI. Have you tried using LCS yet? Does it behave any better?
- I use DEDICATE statements in the VM Directory for the CTCA instead of using DTCPARMS so that the TCPIP machine can run without the need of A and B privilege classes.
Have you tried adding privilege classes A and B to TCPIP? As far as I know ABG are the standard default privilege classes for TCPIP. I personally would be very reluctant to remove privilege classes from users which IBM feels needs them. <shrug>
- The statement in the TCPIP config between GATEWAY and DEFAULTNET is to support the routing to the outside world, this one is for the first hop.
Isn't your DEFAULTNET statement accomplishing the same thing? I mean, I could understand your argument if the IP addresses (i.e. networks, i.e. subnets) were different (thereby requiring special routes), but since they're the same I don't think it's needed. My system doesn't have any such statement(s) and my z/VM system can reach the outside world just fine. <shrug>
I also noticed in the Hercules log you attached:
18:38:34 HHC00109E set_thread_priority( 5 ) failed: Operation not permitted 18:38:34 HHC00007I Previous message from function 'impl' at impl.c(988) 18:38:34 HHC00110W Defaulting all threads to priority 0 18:38:34 HHC00007I Previous message from function 'impl' at impl.c(991) 18:38:34 HHC00100I Thread id 00007f893987b740, prio 0, name 'impl_thread' started 18:38:34 HHC90020W 'hthread_setschedparam()' failed at loc=impl.c:1010: rc=22: Invalid argument 18:38:34 HHC00007I Previous message from function 'hthread_set_thread_prio' at hthreads.c(1170) . . . 18:38:34 HHC00018W Hercules is NOT running in elevated mode 18:38:34 HHC00007I Previous message from function 'impl' at impl.c(1045) 18:38:34 HHC00150I Crypto module loaded (C) Copyright 2003-2016 by Bernard van der Helm 18:38:34 HHC00151I Activated facility: Message Security Assist 18:38:34 HHC00151I Activated facility: Message Security Assist Extension 1, 2, 3 and 4 18:38:34 HHC90020W 'hthread_setschedparam()' failed at loc=impl.c:431: rc=22: Invalid argument 18:38:34 HHC00007I Previous message from function 'hthread_set_thread_prio' at hthreads.c(1170) 18:38:34 HHC00111I Thread CPU Time IS available (_POSIX_THREAD_CPUTIME=0) 18:38:34 HHC90020W 'hthread_setschedparam()' failed at loc=cpu.c:2328: rc=22: Invalid argument 18:38:34 HHC00007I Previous message from function 'hthread_set_thread_prio' at hthreads.c(1170) 18:38:34 HHC00100I Thread id 00007f893932b700, prio 0, name 'Processor CP00' started 18:38:34 HHC90020W 'hthread_setschedparam()' failed at loc=timer.c:192: rc=22: Invalid argument . . .
On Linux, Hercules requires root in order to be able to properly set the dispatching priority of its internal threads. Is there a reason why you're not starting Hercules with elevated (root) privileges?
As I understand it, doing so should be safe since Hercules only uses it when needed (such as when setting thread priorities) but otherwise runs without root privileges. (It drops root almost as soon as it starts and only regains it again for those brief moments when it needs it, such as when setting thread priorities.)
These are just some things to try. You don't have to implement all of them. The one I would personally suggest trying first is reducing your Hercules MAINSIZE setting to 1900MB (i.e. something less than 2GB). I'm hoping that might resolve your problem altogether.
If that doesn't work, then you might try changing your networking device to LCS instead. That also might very well resolve your problem.
And finally, I don't recall asking this before (and please forgive me for asking it, but I'm just naturally incredibly curious!):
I there a good reason WHY you need to be stopping/re-starting TCPIP?!?
I mean, to me, once the system comes up and networking is initialized, why f**k with it? :)
P.S. What type of machine is a model 789A
?! I've never heard of such a thing before! What is it? Have you tried specifying a z800 or z900, etc, CPU model number instead?
P.P.S The FACILITY
statements you have in your configuration file are not needed. Facilities 37, 44 and 45 are all already enabled by default. They are accomplishing nothing.
Another suggestion: rather than defining each 3270 device individually in your configuration file, you might condier using the device number range/count syntax instead, e.g.:
001F-002F 3270
Or:
001F 3270
0020.16 3270
That won't fix your problem of course, but it will make maintaining your configuration file a little bit easier.
Will be sending a hercules trace of E200-E201 CTCA pair.
Still waiting. (NO RUSH! Just thought I'd mention it since it's been a few days since you first wrote that.)
Also, have you had time to try any of my suggested experiments yet?
Does doing any of the above resolve your IFI004
problem?
Geoff, (@gjcullen)
Upon reviewing this issue, I happened to notice that I only tested the #CP EXT
technique of stopping TCPIP (which worked fine), but I had not yet tried your "OBEY" file technique (which I admit to not being familiar with).
So I decided to give it a try, but was unfortunately unable to do so due to an apparent z/VM (configuration? installation?) error.
In an earlier comment, you said:
- The OBEY file is named KILLTCP TCPIP (fm). Its contents are STOP CTCDEV1. It resides on TCPMAINT's 592 disk. It is activated by NETSTAT KILLTCP TCPIP B from a user which is authorized in the TCPIP config to issue an OBEYFILE. The filemode is based upon the way that the TCPIP minidisks are accessed.
So the first thing I of course did, was to try to logon to TCPMAINT. But I couldn't! Each time I would try, I would get the following error:
Enter one of the following commands: LOGON userid (Example: LOGON VMUSER1) DIAL userid (Example: DIAL VMUSER2) MSG userid message (Example: MSG VMUSER2 GOOD MORNING) LOGOFF logon tcpmaint by ibmvm1 ENTER PASSWORD (IT WILL NOT APPEAR WHEN TYPED): z/VM Version 7 Release 1.0, Service Level 1901 (64-bit), built on IBM Virtualization Technology There is no logmsg data FILES: 0038 RDR, NO PRT, NO PUN LOGON AT 05:48:19 PDT SATURDAY 03/27/21 DMSACC724I 19E replaces Y (19E) DMSACP723I Y (19E) R/O DMSPAR280E Application not active 05:48:19 * MSG FROM TCPMAINT: DMSINS2563T RTNLOAD failed for VMMTLIB HCPGIR450W CP entered; disabled wait PSW 000A0000 80021528
And trying to re-ipl CMS does no good either:
ipl cms DMSACC724I 19E replaces Y (19E) DMSACP723I Y (19E) R/O DMSPAR280E Application not active 06:27:35 * MSG FROM TCPMAINT: DMSINS2563T RTNLOAD failed for VMMTLIB HCPGIR450W CP entered; disabled wait PSW 000A0000 80021528
I find this to be quite interesting, and very possibly a strong clue(?) perhaps as to where/what the actual root problem might be? *`()`**
So I now have to ask the obvious question: Does this same thing happen to you? Are you able to logon to your TCPMAINT userid on your own z/VM system? (e.g. z/VM 7.1, but knowing whether or not the problem also occurs on z/VM 5.4 too would also be interesting to know.)
Because without being able to logon to TCPMAINT, I am unable to try your "OBEY" file technique of shutting down TCPIP.
Sorry! :(
*`()`** Unfortunately however, being very unskilled at modern z/VM, I have no freaking clue as to where the problem might be nor how to go about fixing it. Maybe you or someone else more skilled with modern z/VM can figure it out?
UPDATE:
Apparently I am unable to logon to ANY z/VM userid at all!! So far I've tried MAINT710, plain old MAINT, and even ZOS, and they ALL encounter the same error, dropping me into a disabled wait. :(
Another quick update:
I did some quick Googling and found the following:
Does this make sense to anyone?
Sounds to me like one of your VM disk images has become corrupted, in particular the image containing the S-disk minidisk. I would suggest restoring your VM system.
Sounds to me like one of your VM disk images has become corrupted ...
I find that to be highly unlikely since I use shadow files, starting each test from the exact same known-to-be-stable(?) starting point each time.
Nevertheless, since I admit to having never actually tested logging onto any z/VM userid after having modified my existing LCS configuration to use CTCI instead, I suppose there is the possibility that I missed something (did something wrong) and it has always been broken. *`()`**
So I'll revert back to my known working LCS configuration and make the same changes again as I did last time to convert it into a CTCI configuration instead, but this time verifying afterwards that logging onto z/VM userids still works.
Thanks for the feedback.
*`()` _This is an ADCD version of z/VM 7.1 that came pre-configured for OSA after all, and I admit to never having ever tried to manually change it to using a completely different networking device before. As I recall, whenever I originally installed my z/VM 6.3 system (which was not an ADCD), I had to specify at install time** what type of networking device I wanted to use, thus leading me to believe that one cannot simply switch to using a completely different networking device without knowing exactly what one was doing (i.e. without manually modifying a lot of rather gnarly TCPIP related files spread all over the place)._
Rather, what I did with my z/VM 6.3 system, is install it specifying CTCI as my network device, verify it worked okay, then save those shadow files and then reset back to the original virgin uninstalled state and start another brand new install again from the very beginning, this time specifying LCS as my networking device, saving those shadow files, etc, for each type of installation type that z/VM supported (i.e. CTCI, LCS, OSA in level-2 mode and OSA in level-3 mode). I guess I should have used my existing z/VM 6.3 CTCI system for my tests rather than to have tried manually re-configuring my existing pre-install z/VM 7.1 system like I did. That was probably my mistake. I'll try to recreate Geoff's crash using my existing z/VM 6.3 CTCI system instead this time. In retrospect it was a mistake for me to try and do it with my z/VM 7.1 system.
Okay, z/VM 6.3 works fine using CTCI.
I logged onto user TCPMAINT and created a "KILLTCP TCPIP F" file (vaddr 592 was my F drive) consisting of a single line:
STOP CTCDEV1
But when I try using it (in this case from userid MAINT630), I get an error:
msg * xxx
11:12:51 * MSG FROM MAINT630: XXX
Ready; T=0.01/0.01 11:12:51
q disk *
LABEL VDEV M STAT CYL TYPE BLKSZ FILES BLKS USED-(%) BLKS LEFT BLK TOTAL
MNT191 191 A R/W 175 3390 4096 47 411-01 31089 31500
MNT5E6 5E6 B R/W 9 3390 4096 134 1304-80 316 1620
MNT2CC 2CC C R/W 10 3390 4096 5 184-10 1616 1800
MNT51D 51D D R/W 26 3390 4096 297 1821-39 2859 4680
PMT551 551 E R/W 40 3390 4096 10 119-02 7081 7200
MNT190 190 S R/O 207 3390 4096 701 18090-49 19170 37260
TCM592 592 T R/O 140 3390 4096 858 11816-47 13384 25200
MNT193 193 W R/O 500 3390 4096 1280 42715-47 47285 90000
MNT19D 19D X/X R/O 292 3390 1024 404 61228-42 83312 144540
MNT19E 19E Y/S R/O 500 3390 4096 1123 30403-34 59597 90000
PMTCF0 CF0 Z R/W 120 3390 4096 3 23-01 21577 21600
Ready; T=0.01/0.01 11:12:56
type killtcp tcpip t
STOP CTCDEV1
Ready; T=0.01/0.01 11:13:07
netstat killtcp tcpip t
DTCNET103E Invalid parameter: KILLTCP
Ready(00001); T=0.03/0.04 11:13:19
netstat obey killtcp tcpip t
VM TCP/IP Netstat Level 630 TCP/IP Server Name: TCPIP
RDR FILE 0032 SENT FROM TCPIP PUN WAS 0013 RECS 0006 CPY 001 A NOHOLD NOKEEP
DTCNET501E Configuration error. Details are in PROFILE TCPERROR.
Ready(00004); T=0.06/0.07 11:13:27
type profile tcperror
DTCPRS008E Error encountered parsing OBEY or generated command line: KILLTCP TC
PIP
DTCPAR123I Unknown command: 'KILLTCP'
DTCPRS332E Remainder of command line skipped due to error
DTCPRS096I Options enabled after parsing OBEY command line:
Ready; T=0.01/0.01 11:13:42
I'm obviously doing something wrong, but I don't know what!
This is all new to me, and I don't know what I'm doing! :(
Fish,
I have attached the Hercules trace (t+E200 and t+E201) for the CTCs, along with the Hercules configuration file (zvm54cpa.txt).
The trace was initiated just prior to having an authorized user machine entering the command: NETSTAT OBEY STOP CTCDEV1
. (Issuing the STOP CTCDEV1 command directly instead of via an OBEYFILE is much simpler.)
The IFI004 ensued immediately:
Geoff
Fish,
Modify the OBEY statement in the TCPIP profile that you are running.
Add your userid or any userid from which you will issue the stop.
Recycle TCPIP to enable the new OBEY statement.
On the authorized userid type: NETSTAT OBEY STOP CTCDEV1 , assuming CTCDEV1 is the name of the connection.
This method is easier to reproduce the problem (if it does happen on your system).
Geoff
On 03/27/2021 02:17 PM, Fish-Git wrote:
Okay, z/VM 6.3 works fine using CTCI.
I logged onto user TCPMAINT and created a "KILLTCP TCPIP F" file (vaddr 592 was my F drive) consisting of a single line:
|STOP CTCDEV1 |
But when I try using it (in this case from userid MAINT630), I get and error:
|msg xxx 11:12:51 MSG FROM MAINT630: XXX Ready; T=0.01/0.01 11:12:51 q disk * LABEL VDEV M STAT CYL TYPE BLKSZ FILES BLKS USED-(%) BLKS LEFT BLK TOTAL MNT191 191 A R/W 175 3390 4096 47 411-01 31089 31500 MNT5E6 5E6 B R/W 9 3390 4096 134 1304-80 316 1620 MNT2CC 2CC C R/W 10 3390 4096 5 184-10 1616 1800 MNT51D 51D D R/W 26 3390 4096 297 1821-39 2859 4680 PMT551 551 E R/W 40 3390 4096 10 119-02 7081 7200 MNT190 190 S R/O 207 3390 4096 701 18090-49 19170 37260 TCM592 592 T R/O 140 3390 4096 858 11816-47 13384 25200 MNT193 193 W R/O 500 3390 4096 1280 42715-47 47285 90000 MNT19D 19D X/X R/O 292 3390 1024 404 61228-42 83312 144540 MNT19E 19E Y/S R/O 500 3390 4096 1123 30403-34 59597 90000 PMTCF0 CF0 Z R/W 120 3390 4096 3 23-01 21577 21600 Ready; T=0.01/0.01 11:12:56 type killtcp tcpip t STOP CTCDEV1 Ready; T=0.01/0.01 11:13:07 netstat killtcp tcpip t DTCNET103E Invalid parameter: KILLTCP Ready(00001); T=0.03/0.04 11:13:19 netstat obey killtcp tcpip t VM TCP/IP Netstat Level 630 TCP/IP Server Name: TCPIP RDR FILE 0032 SENT FROM TCPIP PUN WAS 0013 RECS 0006 CPY 001 A NOHOLD NOKEEP DTCNET501E Configuration error. Details are in PROFILE TCPERROR. Ready(00004); T=0.06/0.07 11:13:27 type profile tcperror DTCPRS008E Error encountered parsing OBEY or generated command line: KILLTCP TC PIP DTCPAR123I Unknown command: 'KILLTCP' DTCPRS332E Remainder of command line skipped due to error DTCPRS096I Options enabled after parsing OBEY command line: Ready; T=0.01/0.01 11:13:42 |
I'm obviously doing something wrong, but I don't know what! This is all new to me, and I don't know what I'm doing! |:(|
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/SDL-Hercules-390/hyperion/issues/375#issuecomment-808780440, or unsubscribe https://github.com/notifications/unsubscribe-auth/AR2CVIJ3AT4JZICZ3AZFH5LTFYOMNANCNFSM4YV6VIBA.
Geoff,
I just tried your new test (using netstat obey stop ctcdev1
command) just now, and my z/VM 6.3 system remained up and running just fine! <shrug>
There must be something unusual about your particular z/VM system (a misconfiguration perhaps?) that is causing your IFI004, but I don't think it is Hercules. Whenever I try the exact same thing with my z/VM system (admittedly different from yours; mine is 6.3 whereas yours is 5.4), it works fine. There is no IFIF004 and my system remains up and running just fine.
I can stop and start my CTCI device at will (multiple times, whenever I want), and my system keeps chugging along just fine.
Of course, it could still be a bug in Hercules I suppose, but if it is, it's a bug specific to Linux tuntap handling, because on Windows. it works just fine. _(I'm not skilled enough to try the same thing on my KDE Neon 5.20.5 VMware virtual machine; I only use it for testing building Hercules, not for any serious *Nix Hercules debugging, as I don't have the Linux skills for that.)_
If you wish to pursue this problem/issue further, I'm afraid you are going to have to solicit the help of some other Hercules developer who is more experienced with Linux tuntap networking, not me. I don't have the necessary skills to debug this issue any further.
With that in mind, since I honestly do feel the problem lies with your z/VM guest system itself and not with Hercules, I am going to close this issue as "Unable to Reproduce".
If this does not satisfy you, please feel free to re-open this issue and ask a different Hercules developer to help you.
Sorry! :(
I'm seeing this on VM/ESA 2.4 and OS/390 2.10, as well. In my case, I get the IFI004 whenever I stop TCP/IP on either VM or OS/390 running inside it. This is on my desktop Linux system. I use CTCI instead of LCS or OSA or whatever because using a tap interface on that box disrupts other things I do. (It screws up the Ethernet MAC address that programs see, and one in particular gets really confused when the MAC address changes.)
How would you suggest I approach it, not being a VM guy? I'm setting this system up to make sure I remember all the details of VTAM networking between the two systems, and also figuring out how to set up PVM and RSCS-JES2/NJE connections.
I'm still thinking this is due to a spurious interrupt (or lack thereof - although that would only cause the MIH procedure to be invoked) during HDV/HIO/HSCH - anyhow, any one of those functions that signals I/O termination. the new "CTCE" device type should be appropriate for PVM, RSCS/NJE, VTAM PU5 to PU5 (don't forget to enter the Enterprise key for CDRM to work) - also works for more modern z/VM Single System Image (the z/VM cluster thingy)
These abends became regular occurrences for me following commit 5eb60c6 made by Fish on 3 Nov 2017, titled "Fix LCS, CTCI, CTCE and PTP Halt/Clear Subchannel handling". I have tried gathering evidence several times, but turning on traces anywhere naturally ensured the abend did not occur. I experience various symptoms, sometimes VM abends when I shutdown VM, sometimes VTAM abends when I shutdown z/OS, and sometimes VM hangs at the writing the checkpoint stage of shutdown.
I have just migrated a zVM54 image from Hercules available as an rpm under Fedora Linux 31 to a very recent Hyperion version (Hercules version 4.4.9999.0-SDL-g3062c6ea (4.4.9999.0) ) that I freshly downloaded, compiled and linked, from GitHub. Thank you.
I as yet have experienced only a single problem with the change. When TCPIP under zVM54 is brought down using the external interruption command:
the zVM54 abnormally terminates with an
IFI004
and system dump. After a number of experiments I have found that the problem can be triggered by the Hercules configuration definition of the Channel-to-Channel Adapter line (CTCI).When the line in the configuration reads:
I will enjoy connectivity across the emulated CTC with the outside network, but do get the
IFI004
Abend when TCPIP is shutdown.When the line in the configuration reads:
I will have NO connectivity across the CTC, but I will have no problems during the TCPIP shutdown.
The problem is easily reproducible.
Other Information:
Any help or suggestions are appreciated
Respectfully, Geoff