hercules-390 / hyperion

Hercules 390
Other
253 stars 70 forks source link

Hercules hangs in exit processing #276

Open fbi-ranger opened 5 years ago

fbi-ranger commented 5 years ago

Hercules version: 4.2.0.0-SDL-gf8e1fa5b-modified (4.2.0.0) Host OS: openSUSE 15.0 Guest OS: z/VM 6.4 Symptom: Hercules hangs during exit processing.

Hercules console log:

HHC01603I shutdown system phoenix                                                                      HHC00160I SCP command: shutdown system phoenix                                                         23:34:47 SHUTDOWN SYSTEM PHOENIX                                                                       23:34:47 SYSTEM SHUTDOWN STARTED                                                                       23:34:48 HCPSHU2116I SHUTDOWN issued at 2019-07-15 23:34:48 by OPERATOR 23:34:49 HCPSHU960I System shutdown may be delayed for up to 90 seconds        23:34:49 HCPSIG2113I User VMSERVP has reported successful termination                   23:34:49 HCPSIG2113I User VMSERVU has reported successful termination                      23:34:49 HCPSIG2113I User VMSERVS has reported successful termination                           23:34:49 HCPSIG2113I User VMSERVR has reported successful termination                             HHC00809I Processor CP01: disabled wait state 0002000000000000 0000000000009003       23:34:49 HCPSIG2113I User VMNBCSX has reported successful termination                             HHC00814I Processor CP00: SIGP CPU reset                        (0C) CP01, PARM 00000000004E8AA8: CC   HHC00809I Processor CP02: disabled wait state 0002000000000000 0000000000009003       23:34:49 Processor 01 offline                                                                          HHC00814I Processor CP00: SIGP CPU reset                        (0C) CP02, PARM 00000000004E8AA8: CC   HHC00809I Processor CP03: disabled wait state 0002000000000000 0000000000009003       23:34:49 Processor 02 offline                                                                          HHC00814I Processor CP00: SIGP CPU reset                        (0C) CP03, PARM 00000000004E8AA8: CC   HHC00814I Processor CP00: SIGP Store status                     (0E) CP01, PARM 0000000000B2F200: CC   HHC00814I Processor CP00: SIGP Store status                     (0E) CP02, PARM 0000000000B31200: CC   HHC00814I Processor CP00: SIGP Store status                     (0E) CP03, PARM 0000000000B33200: CC   23:34:52 HCPWRP963I SHUTDOWN STEP USOAC - JOURNAL USER TERMINATION 23:34:52 HCPWRP963I SHUTDOWN STEP MFRSD - TERMINATE HARDWARE LOADER 23:34:53 HCPWRP963I SHUTDOWN STEP APISD - TERMINATE OTHER PROCESSORS 23:34:53 HCPWRP963I SHUTDOWN STEP ENASD - DISABLE TERMINAL DEVICES 23:34:54 HCPWRP963I SHUTDOWN STEP KCBSD - PERFORM ISFC SHUTDOWN TASKS                                  23:34:54 HCPWRP963I SHUTDOWN STEP ISHDN - SHUT DOWN I/O SUBSYSTEM                                      23:34:55 HCPWRP963I SHUTDOWN STEP TTRAL - TERMINATE CONCURRENT COPY SESSIONS                           23:34:55 HCPWRP963I SHUTDOWN STEP SGPST - STOP OTHER PROCESSORS                                       23:34:56 HCPWRP959I PHOENIX  SYSTEM TERMINATION IN PROGRESS ON 2019-07-15 23:34:56 HCPWRP963I SHUTDOWN STEP TXTDS - TERMINATE DATA TRACES      23:34:56 HCPWRP963I SHUTDOWN STEP SVACV - ACTIVATE TERMINATION SAVE AREAS 23:34:57 HCPWRP963I SHUTDOWN STEP CHMOF - DISABLE CHANNEL MEASUREMENT   23:34:57 HCPWRP963I SHUTDOWN STEP ISHDA - DISABLE ALL DEVICES                            23:34:58 HCPWRP963I SHUTDOWN STEP CKPSH - TAKE A CHECKPOINT                           23:34:58 HCPWRP963I SHUTDOWN STEP OPRCK - SAVE OPERATOR CONSOLE LIST         23:34:59 HCPWRP963I SHUTDOWN STEP MCWMD - DETERMINE MACHINE CHECK STATUS                               23:34:59 HCPWRP963I SHUTDOWN STEP SDVRS - RESET IBM DASD CU CHARACTERISTICS                            23:35:00 HCPWRP962I VM SHUTDOWN COMPLETED IN 11 SEC                                       23:35:00 HCPWRP963I SHUTDOWN STEP SVADV - DEACTIVATE TERMINATION SAVE AREAS                            23:35:00 HCPWRP961W SYSTEM SHUTDOWN COMPLETE FOR PHOENIX  ON 2019-07-15                                HHC00809I Processor CP00: disabled wait state 0002000000000000 0000000000000961                        HHC01603I quit                                                                                         HHC00101I Thread id 00007f4545c72700, prio -1, name 'http_server' ended HHC00101I Thread id 00007f4545f75700, prio -1, name 'Processor CP00' ended HHC00101I Thread id 00007f4546178700, prio -1, name 'Processor CP01' ended HHC00101I Thread id 00007f4546077700, prio -1, name 'Processor CP02' ended HHC00101I Thread id 00007f4545d73700, prio -1, name 'Processor CP03' ended HHC02915I client 1 COMM: Connection received HHC01030I Connection rejected, device 0700 unavailable HHC02915I client 2 COMM: Connection received HHC01030I Connection rejected, device 0700 unavailable HHC02915I client 3 COMM: Connection received HHC01030I Connection rejected, device 0700 unavailable HHC02915I client 4 COMM: Connection received HHC01030I Connection rejected, device 0700 unavailable HHC02915I client 5 COMM: Connection received HHC01030I Connection rejected, device 0700 unavailable HHC02915I client 6 COMM: Connection received HHC01030I Connection rejected, device 0700 unavailable

HHC02915I is shown because NONSNA device 700 (Console of z/OS 2.3 guest) retries to establish a connection to Hercules.

Hercules hangs after stopping the CP threads. Probably it is related to QETH device EC00-EC02, that was added lately. Definition in hercules.cnf:

EC00.3 QETH CHPID EC IPADDR 10.10.0.190 NETMASK 255.255.255.0

This device is an uplink port of a z/VM layer 3 vswitch:

/***/ DEFINE VSWITCH VMINTRA1 TYPE QDIO RDEV EC00 CONNECT IP MODIFY VSWITCH VMINTRA1 GRANT TCPIP

A vitual nic of guest s0w1 (z/OS 2.3) using this switch.

It seems that when a connection using this device is used, Hercules does not quit correctly.

Starting and exiting Hercules without IPL z/VM works as expected. Also simply IPL z/VM and immediately shutdown and then quitting Hercules seems to work as expected.

ps -ef | grep hercules

hercules 17905 17904 99 Jul15 pts/0 07:11:27 /local/sys1/z390/hercules/bin/hercules -f /local/sys1/z390/etc/hercules.cnf

Only kill -9 terminates that process.

/usr/local/bin/iml-hercules: line 3: 17905 Killed /local/sys1/$VERSION/hercules/bin/hercules -f /local/sys1/$VERSION/etc/hercules.cnf