zowe / zowe-install-packaging

Packaging repository for the Zowe install scripts and files
Eclipse Public License 2.0

Redirecting ZWESVSTC STC JCL to write STDERR and STDOUT to USS causes Zowe to crash #1446

Open Joe-Winchester opened 4 years ago

Joe-Winchester commented 4 years ago

Describe the bug

Edit ZWESVSTC so that the following lines are commented out

//STDOUT   DD SYSOUT=*
//STDERR   DD SYSOUT=*

and the following are uncommented

//*STDOUT   DD PATH='/tmp/zowe.std.out',
//*            PATHOPTS=(OWRONLY,OCREAT,OTRUNC),
//*            PATHMODE=SIRWXU
//*STDERR   DD PATH='/tmp/zowe.std.err',
//*            PATHOPTS=(OWRONLY,OCREAT,OTRUNC),
//*            PATHMODE=SIRWXU
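
For clarity, the edited DD statements would look roughly like the sketch below. It assumes nothing else in the procedure changes and uses the same paths and options as the shipped comments.

```jcl
//* SYSOUT versions commented out
//*STDOUT   DD SYSOUT=*
//*STDERR   DD SYSOUT=*
//* USS file versions uncommented
//STDOUT   DD PATH='/tmp/zowe.std.out',
//            PATHOPTS=(OWRONLY,OCREAT,OTRUNC),
//            PATHMODE=SIRWXU
//STDERR   DD PATH='/tmp/zowe.std.err',
//            PATHOPTS=(OWRONLY,OCREAT,OTRUNC),
//            PATHMODE=SIRWXU
```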

Start Zowe and it will end early. The following is in the /tmp/zowe.std.err file:

+ export ZOWE_APIM_VERIFY_CERTIFICATES 
+ read -r line 
+ /shared/zowe/bin/internal/run-zowe.sh -c /MVS1/var/zowe 
chmod: FSUM6180 file "/MVS1/var/zowe/workspace/backups/backup_configuration.20.06.15.14.34.36.cfg": EDC5139I Operation not permitted. (errno2=0xEF5F603D)
chmod: FSUM6180 file "/MVS1/var/zowe/workspace/backups/backup_configuration.20.06.15.15.01.02.cfg": EDC5139I Operation not permitted. (errno2=0xEF5F603D)
CEE5210S The signal SIGHUP was received.
CEE5210S The signal SIGHUP was received.
CEE5210S The signal SIGHUP was received.
CEE5210S The signal SIGHUP was received.
CEE5210S The signal SIGHUP was received.
John-A-Davies commented 4 years ago

I reproduced the problem on ukzowe3 and made a video recording of it. I want to investigate this further.

John-A-Davies commented 4 years ago

https://ibm.webex.com/webappng/sites/ibm/meeting/postinfo/72D3F230DCF11DF3E0535006FC0A96B3_I_164379142596340078?from_login=true password is 2WbhnCYt

John-A-Davies commented 4 years ago

The server stayed up when I removed GATEWAY from the components to be started, that is, with LAUNCH_COMPONENT_GROUPS=DESKTOP. In that case, you get only 2 SIGHUP messages in STDERR:

chmod: FSUM6180 file "/u/tstradm/zowe/instance8/workspace/backups/backup_configuration.20.06.18.10.11.34.cfg": EDC5139I Operation not permitted. 
CEE5210S The signal SIGHUP was received.                                                                                                         
CEE5210S The signal SIGHUP was received.                                                                                                         

and the server stays up

SDSF DA S0W1     S0W1     PAG  0  CPU/L     2/***
COMMAND INPUT ===>                               
NP   JOBNAME  StepName ProcStep JobID    Owner   
     ZWE1DS1  STEP1             STC03209 ZWESVUSR
     ZWE1DS1  STEP1             STC03252 ZWESVUSR
     ZWE1DS1  STEP1             STC03245 ZWESVUSR
     ZWE1SV6  STEP1             STC03254 ZWESVUSR
     ZWE1SV7  STEP1             STC03250 ZWESVUSR
     ZWE1SV8  STEP1             STC03247 ZWESVUSR
     ZWE1SV9  STEP1             STC03243 ZWESVUSR
John-A-Davies commented 4 years ago

... except that the main Zowe task has ended after 30 seconds, orphaning the others:

07:05:03.19 MYCONS   00000290  S ZWESVSTC,INSTANCE='/u/tstradm/zowe/instance8',JOBNAME=ZWE1SV,           
                               MSGCLASS=X                                                                
07:05:03.27 STC03257 00000281  $HASP100 ZWE1SV   ON STCINRDR                                             
07:05:03.33 STC03257 00000290  IEF695I START ZWESVSTC WITH JOBNAME ZWE1SV   IS ASSIGNED TO USER          
                               ZWESVUSR, GROUP ZWEADMIN                                                  
07:05:03.33 STC03257 00000090  $HASP373 ZWE1SV   STARTED                                                 
07:05:05.21 STC03243 00000290  IEA631I  OPERATOR MYCONS   NOW INACTIVE, SYSTEM=S0W1    , LU=0955BCE4     
07:05:32.32 STC03257 00000090  $HASP395 ZWE1SV   ENDED - RC=0000                                         
07:05:32.36          00000281  IEA989I SLIP TRAP ID=X33E MATCHED.  JOBNAME=*UNAVAIL, ASID=0058.          
07:05:32.37 STC03257 00000281  $HASP250 ZWE1SV PURGED -- (JOB KEY WAS D816BB57)                          
John-A-Davies commented 4 years ago

Another demo, this time showing startup of only GATEWAY, then DESKTOP, and using MSGCLASS=X on the main server job: https://ibm.webex.com/webappng/sites/ibm/recording/play/c3ffdf168b644fd8b1758a3d42e72aa0 (password: dCxwTDB2)

John-A-Davies commented 4 years ago

Comment from CW Cheung: Node.js does not use the STDOUT, STDIN or STDERR DDs. There has to be something else in between that provides this functionality.

John-A-Davies commented 4 years ago

For reference, here is the job log, including the expanded JCL:

 SDSF OUTPUT DISPLAY ZWE1SV   STC03274  DSID     4 LINE 1       COLUMNS 02- 161            
 COMMAND INPUT ===>                                            SCROLL ===> CSR             

                       J E S 2  J O B  L O G  --  S Y S T E M  S 0 W 1  --  N O D E  S 0 W 1                         

10.04.52 STC03274 ---- THURSDAY,  18 JUN 2020 ----                                                                   
10.04.52 STC03274  IEF695I START ZWESVSTC WITH JOBNAME ZWE1SV   IS ASSIGNED TO USER ZWESVUSR, GROUP ZWEADMIN         
10.04.52 STC03274  $HASP373 ZWE1SV   STARTED                                                                         
10.05.20 STC03274  $HASP395 ZWE1SV   ENDED - RC=0000                                                                 
        1 //ZWE1SV   JOB MSGLEVEL=1                                               STC03274                           
        2 //STARTING EXEC ZWESVSTC,INSTANCE='/u/tstradm/zowe/instance8'                                              
          XX********************************************************************                                     
          XX* This program and the accompanying materials are made available   *                                     
          XX* under the terms of the Eclipse Public License v2.0 which         *                                     
          XX* accompanies this distribution, and is available at               *                                     
          XX* https://www.eclipse.org/legal/epl-v20.html                       *                                     
          XX*                                                                  *                                     
          XX* SPDX-License-Identifier: EPL-2.0                                 *                                     
          XX*                                                                  *                                     
          XX* Copyright IBM Corporation 2018, 2019                             *                                     
          XX********************************************************************                                     
          XX*                                                                  *                                     
          XX* ZOWE SERVER PROCEDURE                                            *                                     
          XX*                                                                  *                                     
          XX* This is a procedure to start the Node servers, API Mediation     *                                     
          XX* and explorers                                                    *                                     
          XX*                                                                  *                                     
          XX* Invoke this procedure, specifying the root path where the        *                                     
          XX* ZOWE server is installed on your system.                         *                                     
          XX*                                                                  *                                     
          XX*   S ZWESVSTC,INSTANCE='{{instance_directory}}'                   *                                     
          XX*                                                                  *                                     
          XX*                                                                  *                                     
          XX********************************************************************                                     
        3 XXZWESVSTC   PROC INSTANCE='{{instance_directory}}'                                                        
          XX*-------------------------------------------------------------------                                     
          XX* INSTANCE - The path to the HFS directory where the                                                     
          XX*            zowe instance was created                                                                   
          XX*-------------------------------------------------------------------                                     
        4 XXEXPORT EXPORT SYMLIST=*                                                                                  
        5 XXZOWESTEP EXEC PGM=BPXBATSL,REGION=0M,TIME=NOLIMIT,                                                       
          XX  PARM='PGM /bin/sh &INSTANCE/bin/internal/run-zowe.sh'                                                  
          XX*STDOUT   DD SYSOUT=*                                                                                    
          XX*STDERR   DD SYSOUT=*                                                                                    
          XX*-------------------------------------------------------------------                                     
          XX* Optional logging parameters that can be configured if required                                         
          XX*-------------------------------------------------------------------                                     
          IEFC653I SUBSTITUTION JCL - PGM=BPXBATSL,REGION=0M,TIME=NOLIMIT,PARM='PGM /bin/sh                          
          /u/tstradm/zowe/instance8/bin/internal/run-zowe.sh'                                                        
        6 XXSTDOUT   DD PATH='&INSTANCE/logs/zowe.svr.stdout',                                                       
          XX            PATHOPTS=(OWRONLY,OCREAT,OTRUNC,OSYNC),                                                      
          XX            PATHMODE=(SIRWXU,SIRWXG,SIRWXO)                                                              
          IEFC653I SUBSTITUTION JCL - PATH='/u/tstradm/zowe/instance8/logs/zowe.svr.stdout',PATHOPTS=(OWRONLY,OCREAT,
          OTRUNC,OSYNC),PATHMODE=(SIRWXU,SIRWXG,SIRWXO)                                                              
        7 XXSTDERR   DD PATH='&INSTANCE/logs/zowe.svr.stderr',                                                       
          XX            PATHOPTS=(OWRONLY,OCREAT,OTRUNC,OSYNC),                                                      
          XX            PATHMODE=(SIRWXU,SIRWXG,SIRWXO)                                                              
          IEFC653I SUBSTITUTION JCL - PATH='/u/tstradm/zowe/instance8/logs/zowe.svr.stderr',PATHOPTS=(OWRONLY,OCREAT,
          OTRUNC,OSYNC),PATHMODE=(SIRWXU,SIRWXG,SIRWXO)                                                              
 STMT NO. MESSAGE                                                                          
        2 IEFC001I PROCEDURE ZWESVSTC WAS EXPANDED USING SYSTEM LIBRARY ADCD.Z23B.PROCLIB  
IEF695I START ZWESVSTC WITH JOBNAME ZWE1SV   IS ASSIGNED TO USER ZWESVUSR, GROUP ZWEADMIN  
IEFA111I ZWE1SV IS USING THE FOLLOWING JOB RELATED SETTINGS:                               
         SWA=ABOVE,TIOT SIZE=32K,DSENQSHR=DISALLOW,GDGBIAS=JOB                             
IEF236I ALLOC. FOR ZWE1SV ZWE1SV                                                           
IGD103I SMS HFS FILE ALLOCATED TO DDNAME STDOUT                                            
IGD103I SMS HFS FILE ALLOCATED TO DDNAME STDERR                                            
IEF142I ZWE1SV ZWE1SV - STEP WAS EXECUTED - COND CODE 0000                                 
IGD104I HFS FILE WAS RETAINED, DDNAME IS (STDOUT  )                                        
FILENAME IS (/u/tstradm/zowe/instance8/logs/zowe.svr.stdout)                               
IGD104I HFS FILE WAS RETAINED, DDNAME IS (STDERR  )                                        
FILENAME IS (/u/tstradm/zowe/instance8/logs/zowe.svr.stderr)                               
IEF373I STEP/ZOWESTEP/START 2020170.1004                                                   
IEF032I STEP/ZOWESTEP/STOP  2020170.1005                                                   
        CPU:     0 HR  00 MIN  00.28 SEC    SRB:     0 HR  00 MIN  00.00 SEC               
        VIRT:   116K  SYS:   168K  EXT:      484K  SYS:     9776K                          
        ATB- REAL:                    12K  SLOTS:                     0K                   
             VIRT- ALLOC:       7M SHRD:       0M                                          
IEF375I  JOB/ZWE1SV  /START 2020170.1004                                                   
IEF033I  JOB/ZWE1SV  /STOP  2020170.1005                                                   
        CPU:     0 HR  00 MIN  00.28 SEC    SRB:     0 HR  00 MIN  00.00 SEC               
******************************** BOTTOM OF DATA *******************************************
John-A-Davies commented 4 years ago

In ZWESVSTC.jcl, the SRVRPATH variable used on the STDOUT and STDERR USS DD statements

PATH='&SRVRPATH/std.out'

is wrong and will produce a JCL ERROR. It was not updated when INSTANCE was introduced, so SRVRPATH should be replaced by INSTANCE.

This is a separate problem which I had to fix before the JCL would run. It is not the cause of the server crash.
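
A minimal sketch of that substitution, assuming only the symbol changes. The std.err file name is an assumption (only std.out is quoted above), and PATHOPTS and PATHMODE are copied from the issue description. The job log earlier in this thread shows the procedure as actually run, which writes to &INSTANCE/logs/zowe.svr.stdout and zowe.svr.stderr.

```jcl
//* Before: SRVRPATH is not a parameter of the PROC
//*STDOUT   DD PATH='&SRVRPATH/std.out',
//* After: use the INSTANCE parameter of the PROC
//STDOUT   DD PATH='&INSTANCE/std.out',
//            PATHOPTS=(OWRONLY,OCREAT,OTRUNC),
//            PATHMODE=SIRWXU
//* std.err is an assumed name, mirroring std.out
//STDERR   DD PATH='&INSTANCE/std.err',
//            PATHOPTS=(OWRONLY,OCREAT,OTRUNC),
//            PATHMODE=SIRWXU
```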

John-A-Davies commented 4 years ago

This problem does not occur if you instead direct STDOUT and STDERR to z/OS datasets.
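
The comment does not say which DD parameters were used. One possible shape, as a sketch only, assumes two pre-allocated sequential data sets. The names below are hypothetical, and DISP=OLD is an assumption.

```jcl
//* Hypothetical data set names; both assumed to exist already
//STDOUT   DD DSN=ZWESVUSR.ZOWE.STDOUT,DISP=OLD
//STDERR   DD DSN=ZWESVUSR.ZOWE.STDERR,DISP=OLD
```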