zowe / community

Zowe Community - Sub-projects, Squads, Contribution Guidelines, Meeting Minutes, and more

Zowe 2.6.1 fails with ABENDS0C4-4 when starting #1852

Closed hockeyrob closed 1 year ago

hockeyrob commented 1 year ago

Describe the bug: Pretty simple; Zowe 2.6.1 fails with ABENDS0C4-4 when starting.

The CEEDUMP from the ABEND starts with: CEE3204S The system detected a protection exception (System Completion Code=0C4).
From compile unit ZZOW04:/ZOWE/tmp/pax-packaging-launcher-1673380536536/content/build/../deps/launcher/common/c/logging.c at
entry point logConfigureDestination at statement 436 at compile unit offset +0000000013CACD06 at entry offset
+00000000000001CE at address 0000000013CACD06.

Steps to Reproduce: Fails every time when I try to start ZWESLSTC.

Details: I just applied the PTFs to bring Zowe up to 2.6.1 from 2.2.0; created the SZWELOAD data set...but can't see where to put that into any JCL.

The first time I tried it after the upgrade it failed, and included messages about some obsolete LE options I had in my CEEOPTS data set. I commented those out (ALL31, ANYHEAP, HEAP) and tried again; still failed, but this time it didn't complain about any obsolete LE parameter specifications.

This is occurring under z/OS V2R5. Zowe V2.6.1, UO02064, UO02065

logs:

Information for enclave main

Information for thread 1449780000000000

Traceback:
DSA   Entry                             E  Offset  Statement  Load Mod  Program Unit  Service  Status
1     CEEHDSP                           +00003FD8             CELQLIB   CEEHDSP       HLE77D0  Call
2     CEEOSIGJ                          +0000095C             CELQLIB   CEEOSIGJ      HLE77D0  Call
3     CELQHROD                          +00000266             CELQLIB   CELQHROD      HLE77D0  Call
4     CEEOSIGG                          -09D5A350             CELQLIB   CEEOSIGG      HLE77D0  Call
5     CELQHROD                          +00000266             CELQLIB   CELQHROD      HLE77D0  Call
6     logConfigureDestination           +000001CE  436        ZWELNCH   logging.c              Exception
7     logConfigureStandardDestinations  +0000005E  480        ZWELNCH   logging.c              Call
8     main                              +00000120  1246       ZWELNCH   main.c                 Call
9     CELQINIT                          +00001ACA             CELQLIB   CELQINIT      HLE77D0  Call

DSA   DSA Addr          E  Addr             PU Addr           PU Offset   Comp Date Compile Attributes                   
1     00000050082F9680  0000000011DA5980    0000000011DA5980  00003FD8    20210715  CEL       POSIX  XPLINK  EBCDIC  HFP 
2     00000050082FC7E0  0000000012057AE0    0000000012057AE0  0000095C    20210317  CEL       POSIX  XPLINK  EBCDIC  HFP 
3     00000050082FD1E0  0000000011DB9390    0000000011DB9390  00000266    20210317  CEL       POSIX  XPLINK  EBCDIC  HFP 
4     00000050082FD3E0  0000000012050AD0    0000000012050AD0  09D5A350    20210525  CEL       POSIX  XPLINK  EBCDIC  HFP 
5     00000050082FE400  0000000011DB9390    0000000011DB9390  00000266    20210317  CEL       POSIX  XPLINK  EBCDIC  HFP 
6     00000050082FE600  0000000013CACB38    0000000000000000  ********    20230110  C/C++     POSIX  XPLINK  EBCDIC  IEEE
7     00000050082FE7C0  0000000013CACDB8    0000000000000000  ********    20230110  C/C++     POSIX  XPLINK  EBCDIC  IEEE
8     00000050082FE8C0  0000000013C62BA8    0000000000000000  ********    20230110  C/C++     POSIX  XPLINK  EBCDIC  IEEE
9     00000050082FF200  0000000011B96010    0000000011B96010  00001ACA    20210317  CEL       POSIX  XPLINK  EBCDIC  HFP 

Fully Qualified Names                                                                                                    
DSA   Entry       Program Unit                                        Load Module                                        
6     logConfigureDestination                                                                                            
                  ZZOW04:/ZOWE/tmp/pax-packaging-launcher-1673380536536/content/build/../deps/launcher/common/c/logging.c
                                                                      ZWELNCH                                            
7     logConfigureStandardDestinations                                                                                   
                  ZZOW04:/ZOWE/tmp/pax-packaging-launcher-1673380536536/content/build/../deps/launcher/common/c/logging.c
                                                                      ZWELNCH                                            
8     main        ZZOW04:/ZOWE/tmp/pax-packaging-launcher-1673380536536/content/build/../src/main.c                      
                                                                      ZWELNCH                                            

Condition Information for Active Routines
Condition Information for (see message CEE3843I below) (DSA address 00000050082FE600)
CIB Address: 00000050082FA9C8
Current Condition:
CEE0198S The termination of a thread was signaled due to an unhandled condition.
Original Condition:
CEE3204S The system detected a protection exception (System Completion Code=0C4).
Location:
Program Unit: (see message CEE3843I below)
Entry: logConfigureDestination
Statement: 436 Offset: +000001CE
CEE3843I The program unit name is too long to be displayed. See the Fully Qualified Names section for the complete name.
Machine State:
ILC..... 0000 Interruption Code..... 0004
PSW..... 0785040180000000 0000000013CACD06
CEE3DMP V2 R5.0: Condition processing resulted in the unhandled condition. Thu Feb 16 20:07:37 2023 Page: 2 ASID: 00C0 Job ID: STC04769 Job name: ZWESLSTC Step name: ZWELNCH PID: 67109054 Parent PID: 1 User name: ZWESLSTC

    GPR0..... 033E030012400020  GPR1..... 0000000008714D30  GPR2..... 00000000008F0000  GPR3..... 0000000013CADD8E            
    GPR4..... 00000050082FE600  GPR5..... 000000500860C670  GPR6..... 033E030012400020  GPR7..... 0000000000000000            
    GPR8..... 00000050082FF8D8  GPR9..... 0000000013CADAA8  GPR10.... 00000050001055E8  GPR11.... 00000050087179D0            
    GPR12.... 0000005000000003  GPR13.... 0000000000006F28  GPR14.... 0000000000000880  GPR15.... 0000000011B96010            
    FPC...... 00000000                                                                                                        
    FPR0..... 4F00DCDE  70846A8B            FPR1..... 4124D763  776AAA2B                                                      
    FPR2..... 004AB400  00000840            FPR3..... 413243F6  A8885A31                                                      
    FPR4..... 406F2DEC  549B9439            FPR5..... 416487ED  5110B461                                                      
    FPR6..... 40B17217  F7D1CF7A            FPR7..... 411921FB  54442D18                                                      
    FPR8..... 00000000  00000000            FPR9..... 00000000  00000000                                                      
    FPR10.... 00000000  00000000            FPR11.... 00000000  00000000                                                      
    FPR12.... 00000000  00000000            FPR13.... 00000000  00000000                                                      
    FPR14.... 00000000  00000000            FPR15.... 00000000  00000000                                                      

hockeyrob commented 1 year ago

I added your example value for the HEAP64 parameter, which was: HEAP64(512M,4M,KEEP,256M,4M,KEEP,OK,FREE) and got:

CEE3792I The following messages pertain to the DD:CEEOPTS dataset run-time options.
CEE3614I An invalid character occurred in the numeric string 'OK' of the run-time option HEAP64.
CEE3614I An invalid character occurred in the numeric string 'FREE' of the run-time option HEAP64.

And, no change, still the same ABEND.

hockeyrob commented 1 year ago

Ok, not the same result....it got the same ABEND, but with:

CEE3204S The system detected a protection exception (System Completion Code=0C4).
From compile unit ZZOW04:/ZOWE/tmp/pax-packaging-launcher-1673380536536/content/build/../deps/launcher/common/c/logging.c at
entry point logConfigureDestination at statement 426 at compile unit offset +0000000013CACC2A at entry offset
+00000000000000F2 at address 0000000013CACC2A.
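
A quick arithmetic check suggests that both dumps fail inside the same routine, just at different statements. The sketch below assumes the entry address of logConfigureDestination is the E Addr value 0000000013CACB38 shown for DSA 6 in the first dump; adding each reported entry offset to it reproduces both failing addresses, which is consistent with the module loading at the same address on both runs.

```python
# Entry address of logConfigureDestination, taken from the "E  Addr" column
# for DSA 6 in the first CEEDUMP (assumed unchanged on the second run).
ENTRY_ADDR = 0x13CACB38


def failing_address(entry_addr, entry_offset):
    """Return the instruction address for an offset into the entry point."""
    return entry_addr + entry_offset


first = failing_address(ENTRY_ADDR, 0x1CE)   # reported at statement 436
second = failing_address(ENTRY_ADDR, 0xF2)   # reported at statement 426

print(f"{first:016X}")   # 0000000013CACD06
print(f"{second:016X}")  # 0000000013CACC2A
```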

balhar-jakub commented 1 year ago

@1000TurquoisePogs Can you please take a look at this problem?

pinpan commented 1 year ago

It looks like the example has a typo in the 7th parameter: it reads 'OK' (letter O) where '0K' (digit zero followed by K) is probably needed. See: https://www.ibm.com/docs/en/zos/2.4.0?topic=options-heap64-amode-64-only

hockeyrob commented 1 year ago

Thanks, pinpan....fixed that. It did have an "O" instead of "0", and there needs to be another numeric parameter after it, so where you have "OK" you need "0K,0K". No real change; this is the first output:

CEE3204S The system detected a protection exception (System Completion Code=0C4).
From compile unit ZZOW04:/ZOWE/tmp/pax-packaging-launcher-1673380536536/content/build/../deps/launcher/common/c/logging.c at
entry point logConfigureDestination at statement 426 at compile unit offset +0000000013CACC2A at entry offset
+00000000000000F2 at address 0000000013CACC2A.

So, same result. But you're right, the HEAP64 spec needs to be changed.

As a shot in the dark I started ZWESISTC so the cross-memory server was running when I start ZWESLSTC; the cross-memory server starts ok, and has been running now for several days, but I still get the listed crash from ZWESLSTC.

What else can I send you?

Joe-Winchester commented 1 year ago

Working with a customer today, @ifakhrutdinov suggested that the error was because all of the components starting in the same address space were colliding with storage access.
The fix that worked for the customer was to change KEEP,256M -> KEEP,32M.
We may need to update our documentation chapter https://docs.zowe.org/stable/user-guide/configure-uss/#language-environment. @samanthasusu

pj892031 commented 1 year ago

If I understand correctly, there was not enough room for memory allocation. That means the method makeLocalLoggingContext called safeMalloc31 to allocate the memory, but the response was NULL. What about improving safeMalloc31, or adding detection of a LoggingContext that was not created (or rather both)? I guess that when a C application asks for memory and does not get it, it should at least write a log message about it, and the missing LoggingContext should end with an ABEND.

Missing verification (and exit) that it is not NULL: https://github.com/zowe/launcher/blob/2be472b794647198f756b86cc0ba10223edb094a/src/main.c#L1474

Locations to add a new log message about missing resources:
https://github.com/zowe/zowe-common-c/blob/541462f70ceff3ca0066aacc203b03df50cdd3d4/c/alloc.c#L491
https://github.com/zowe/zowe-common-c/blob/541462f70ceff3ca0066aacc203b03df50cdd3d4/c/alloc.c#L530
https://github.com/zowe/zowe-common-c/blob/541462f70ceff3ca0066aacc203b03df50cdd3d4/c/alloc.c#L567
https://github.com/zowe/zowe-common-c/blob/541462f70ceff3ca0066aacc203b03df50cdd3d4/c/alloc.c#L844
https://github.com/zowe/zowe-common-c/blob/541462f70ceff3ca0066aacc203b03df50cdd3d4/c/alloc.c#L1053

1000TurquoisePogs commented 1 year ago

The recommendation to lower memory to avoid an out-of-memory situation was merged 2 weeks ago; it just needs to be propagated throughout the website: https://github.com/zowe/docs-site/pull/2580

Joe-Winchester commented 1 year ago

Related to https://github.com/zowe/docs-site/pull/2664.

Joe-Winchester commented 1 year ago

@1000TurquoisePogs and @ifakhrutdinov. In the PR https://github.com/zowe/docs-site/pull/2580 the recommendation is

HEAP64(4M,4M,KEEP,1M,1M,KEEP,0K,0K,FREE)

whereas when working with the customer Irek suggested

HEAP64(512M,4M,KEEP,32M,4M,KEEP,0K,FREE)

Both are much less than the 256M that was causing the abend. Which version would you like as the "single version of truth" going forward? If the smaller values in Sean's 2580 work then we could go with that; however, the larger values also seem to work and give us more headroom.

hockeyrob commented 1 year ago

I don't think it's a HEAP64 issue; I've tried:

HEAP64(512M,4M,KEEP,128M,4M,KEEP,0K,FREE)
HEAP64(512M,4M,KEEP,256M,4M,KEEP,0K,0K,FREE)
HEAP64(512M,4M,KEEP,128M,4M,KEEP,0K,0K,FREE)
HEAP64(512M,4M,KEEP,64M,4M,KEEP,0K,0K,FREE)
HEAP64(512M,4M,KEEP,32M,1M,KEEP,0K,0K,FREE)
HEAP64(512M,4M,KEEP,1M,1M,KEEP,0K,0K,FREE)
HEAP64(512M,4M,KEEP,1M,256K,KEEP,0K,0K,FREE)

And then tried reducing the 64-bit heap:

HEAP64(256M,4M,KEEP,1M,1M,KEEP,0K,0K,FREE)
HEAP64(64M,4M,KEEP,1M,1M,KEEP,0K,0K,FREE)

One of these gave me a slightly different result...

CEE3204S The system detected a protection exception (System Completion Code=0C4).
From compile unit ZZOW04:/ZOWE/tmp/pax-packaging-launcher-1673380536536/content/build/../deps/launcher/common/c/logging.c at
entry point logConfigureDestination at statement 436 at compile unit offset +0000000013CACD06 at entry offset
+00000000000001CE at address 0000000013CACD06.

But the rest gave me the same result as before:

CEE3204S The system detected a protection exception (System Completion Code=0C4).
From compile unit ZZOW04:/ZOWE/tmp/pax-packaging-launcher-1673380536536/content/build/../deps/launcher/common/c/logging.c at
entry point logConfigureDestination at statement 426 at compile unit offset +0000000013CACC2A at entry offset
+00000000000000F2 at address 0000000013CACC2A.

So, I don't think it's a HEAP64 problem, unless I need to give it a lot more than 512M above the bar.

Going forward, you need to make sure to specify 0K,0K for the below-the-line storage; LE will complain if you omit the secondary value.
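
The layout discussed above (three size/increment/disposition triples, for above-the-bar, below-the-bar and below-the-line storage) can be checked before restarting the started task. This is only a rough sketch: check_heap64 is a hypothetical helper, not part of Zowe or LE, and it is stricter than LE itself, which may accept omitted suboptions and fall back to defaults.

```python
import re

# Size suboptions: digits with an optional K, M, or G suffix (e.g. 0K, 4M, 512M).
SIZE = re.compile(r"\d+[KMG]?")
DISPOSITIONS = {"KEEP", "FREE"}


def check_heap64(option):
    """Return a list of problems found in a HEAP64(...) option string."""
    match = re.fullmatch(r"\s*HEAP64\((.*)\)\s*", option)
    if match is None:
        return ["not a HEAP64(...) option"]
    parts = [part.strip() for part in match.group(1).split(",")]
    problems = []
    if len(parts) != 9:
        problems.append(f"expected 9 suboptions, found {len(parts)}")
    for position, value in enumerate(parts[:9], start=1):
        if position % 3 == 0:
            # Every third suboption is a disposition.
            if value not in DISPOSITIONS:
                problems.append(f"suboption {position}: '{value}' is not KEEP or FREE")
        elif SIZE.fullmatch(value) is None:
            # All other suboptions are sizes or increments.
            problems.append(f"suboption {position}: '{value}' is not a numeric size")
    return problems


for candidate in (
    "HEAP64(512M,4M,KEEP,256M,4M,KEEP,OK,FREE)",    # letter O, missing increment
    "HEAP64(512M,4M,KEEP,32M,4M,KEEP,0K,0K,FREE)",  # corrected form
):
    print(candidate, check_heap64(candidate))
```

For the first string the sketch flags 'OK' and 'FREE' as non-numeric, the same two values that the CEE3614I messages earlier in this thread complain about.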

balhar-jakub commented 1 year ago

When helping with this problem, one more change was needed: set the following key in zowe.yaml to false.

zowe.launcher.shareAs: false

The info about setting it is here: https://docs.zowe.org/stable/appendix/zowe-yaml-configuration/#launcher-and-launch-scripts

hockeyrob commented 1 year ago

The problem appears to have been with HEAPPOOLS. I turned on RPTSTG and it recommended a different setting, which also didn’t work, and recommended HEAPP=(OFF), which let me get past that problem. Messages in the launcher output say HEAP64 is an invalid runtime option or is not supported in this release of LE.

Bottom line, Zowe still isn’t working, but I got past this ABEND. I’m going to stop the launcher and restart without RPTSTG.

hockeyrob commented 1 year ago

I tried adding it before the components section as zowe.launcher.shareAS: false

and by adding launcher: shareAs: false

and both give me this same result when trying to start ZWESLSTC:

2023-02-23 15:41:14 ZWESLSTC INFO ZWEL0021I Zowe Launcher starting
2023-02-23 15:41:14 ZWESLSTC INFO ZWEL0023I Zowe YAML config file is 'FILE(/etc/zowe.yaml)'
2023-02-23 15:41:14 ZWESLSTC INFO ZWEL0024I HA_INSTANCE_ID is '{{ha_instance_id}}'
2023-02-23 15:41:14 ZWESLSTC INFO ZWEL0017I ROOT_DIR is '/usr/lpp/zowe'
2023-02-23 15:41:14 ZWESLSTC ERROR ZWEL0070E Configuration has validity exceptions:
Validity Exceptions(s) with object at
Validity Exceptions(s) with object at /zowe
Validity Exceptions(s) with object at /zowe/launcher
unspecified additional property not allowed: 'shareAS' at '/zowe/launcher/shareAS'

So, where and how do YOU specify it? The doc at the specified link didn’t show either.

Joe-Winchester commented 1 year ago

@hockeyrob.

... where and how do YOU specify it? The doc at the specified link didn’t show either

In the zowe.yaml file add two lines. The first is launcher: starting at column 2 (if you don't already have this present) and beneath that starting at column 4 shareAs: false.

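[image] https://user-images.githubusercontent.com/28100302/220962938-372d24d3-4297-4135-af07-671ce5f2243e.png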

Looking at the error you're getting, maybe this is because you have shareAS and not shareAs?

hockeyrob commented 1 year ago

I put it after the job: specification, under the zowe: specification:

. . .
  # >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
  # runtime z/OS job name
  job:
    # Zowe JES job name
    name: ZWESLSTC
    # Prefix of component address space
    prefix: ZOWE

  launcher:
    shareAs: false

  # >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
  # This is an ID you use to separate multiple Zowe installs when determining
  # resource names used in RBAC authorization checks such as dataservices with RBAC
  # expects this ID in SAF resources
  rbacProfileIdentifier: "1"
. . .

This is what it gave me:

2023-02-23 16:09:44 ZWESLSTC INFO ZWEL0021I Zowe Launcher starting
2023-02-23 16:09:44 ZWESLSTC INFO ZWEL0023I Zowe YAML config file is 'FILE(/etc/zowe.yaml)'
2023-02-23 16:09:44 ZWESLSTC INFO ZWEL0024I HA_INSTANCE_ID is '{{ha_instance_id}}'
2023-02-23 16:09:44 ZWESLSTC INFO ZWEL0017I ROOT_DIR is '/usr/lpp/zowe'
2023-02-23 16:09:44 ZWESLSTC ERROR ZWEL0070E Configuration has validity exceptions:
Validity Exceptions(s) with object at
Validity Exceptions(s) with object at /zowe
Validity Exceptions(s) with object at /zowe/launcher
unspecified additional property not allowed: 'shareAs' at '/zowe/launcher/shareAs'

hockeyrob commented 1 year ago

Yep, noticed the capitalization error... retried with it corrected, and got the same result.

hockeyrob commented 1 year ago

So, the launcher still wasn't starting the gateway. I added a few items into my zowe.yaml that were new (since my last install) in example-zowe.yaml, including using the configmgr with validation set to STRICT, so I can see what is failing and when. I added the onComponentConfigureFail: warn, and set debug to "true" for the gateway. This was all so I could see why the gateway won't start for me.

And, I see that there are some protection exceptions (more 0C4s) in logging.c at entry point makeLocalLoggingContext. There are two of these errors, as the gateway and zss fail to start.

1000TurquoisePogs commented 1 year ago

The problem appears to have been with HEAPPOOLS. I turned on RPTSTG and it recommended a different setting, which also didn’t work, and recommended HEAPP=(OFF), which let me get past that problem. Messages in the launcher output say HEAP64 is an invalid runtime option or is not supported in this release of LE.

Bottom line, Zowe still isn’t working, but I got past this ABEND. I’m going to stop the launcher and restart without RPTSTG.

Thank you for that. It's a problem known to me but I thought we had fixed it. It's just another occurrence and we hadn't covered them all, so I hope we'll be better off with this change https://github.com/zowe/launcher/pull/64

1000TurquoisePogs commented 1 year ago

Regarding zowe.launcher.shareAs, the schema documents that its values can be "yes" or "no", not true/false: https://github.com/zowe/zowe-install-packaging/blob/v2.x/staging/schemas/zowe-yaml-schema.json#L476
You may not need it at all, but you can give it a try.

hockeyrob commented 1 year ago

Pretty bizarre.... I added this to zowe.yaml, right after the launchScript group:

  launcher:
    shareAs: no

and......

2023-02-24 17:32:03 ZWESLSTC INFO ZWEL0021I Zowe Launcher starting
2023-02-24 17:32:03 ZWESLSTC INFO ZWEL0023I Zowe YAML config file is 'FILE(/etc/zowe.yaml)'
2023-02-24 17:32:03 ZWESLSTC INFO ZWEL0024I HA_INSTANCE_ID is '{{ha_instance_id}}'
2023-02-24 17:32:03 ZWESLSTC INFO ZWEL0017I ROOT_DIR is '/usr/lpp/zowe'
2023-02-24 17:32:03 ZWESLSTC ERROR ZWEL0070E Configuration has validity exceptions:
Validity Exceptions(s) with object at
Validity Exceptions(s) with object at /zowe
Validity Exceptions(s) with object at /zowe/launcher
unspecified additional property not allowed: 'shareAs' at '/zowe/launcher/shareAs'

Taking it back out and retrying. There are still a number of protection exceptions when it starts, mostly from configmgr, which keeps other things from starting. Presuming the SMP/E update was good enough to get the software to 2.6.1, I'm guessing I have problems in the zowe.yaml configuration, so I'm going to write up what I have in mine that's different from the example-zowe.yaml that was distributed. I'll get back to you as soon as I have that.

hockeyrob commented 1 year ago

Okay, these are the configuration items in zowe.yaml that I have changed from the example:

zowe.setup.dataset: prefix, proclib, parmlib, jcllib, loadlib, authLoadlib, authPluginLib
zowe.setup.security: product, groups, users, stcs
zowe.setup.certificate.pkcs12: directory, lock, etc...
zowe.runtimeDirectory
zowe.logDirectory
zowe.workspaceDirectory
zowe.extensionDirectory
zowe.configmgr.validation: "STRICT"
zowe.job: name, prefix
zowe.cookieIdentifier
zowe.externalDomains
zowe.certificate: keystore values, truststore values, pem values
zowe.verifyCertificates: DISABLED
java.home
node.home
zOSMF: host, port
components.gateway.debug: true
components.caching-service.enabled: false
components.zss.tls: false

The caching service is disabled; should we want it, it will use VSAM, so the infinispan configuration item(s) are removed. Other than these items, everything matches what's in example-zowe.yaml.

None of the certificate items should matter, since I've specified verifyCertificates as DISABLED. The launcher can find node and java. Still getting 0C4 protection exceptions which keep the gateway, api-catalog and discovery from starting. What else should I change to get this thing working?

Joe-Winchester commented 1 year ago

@hockeyrob , did you indent so it looks like

  launcher:
    shareAs: no

1000TurquoisePogs commented 1 year ago

The "shareAs" violation is due to a typo discovered last month in that schema file, which isn't fixed until the to-be-released Zowe 2.7. It specifically affects "zowe.launcher.shareAs" but does not affect "components.componentname.launcher.shareAs", which should work. It's the difference between global and per-component settings. The fix for the global one is basically indentation within zowe/schemas/zowe-yaml-schema.json.

change

        "launcher": {
          "type": "object",
          "description": "Set default behaviors of how the Zowe launcher will handle components",
          "additionalProperties": false,
          "properties": {
            "restartIntervals": {
              "type": "array",
              "description": "Intervals of seconds to wait before restarting a component if it fails before the minUptime value.",
              "items": {
                "type": "integer"  
              },
              "minUptime": {
                "type": "integer",
                "default": 90,
                "description": "The minimum amount of seconds before a component is considered running and the restart counter is reset."
              },
              "shareAs": {
                "type": "string",
                "description": "Determines which SHAREAS mode should be used when starting a component",
                "enum": ["no", "yes", "must", ""],
                "default": "yes"

              }
            }
          }
        },

to

        "launcher": {
          "type": "object",
          "description": "Set default behaviors of how the Zowe launcher will handle components",
          "additionalProperties": false,
          "properties": {
            "restartIntervals": {
              "type": "array",
              "description": "Intervals of seconds to wait before restarting a component if it fails before the minUptime value.",
              "items": {
                "type": "integer"  
              }
            },
            "minUptime": {
              "type": "integer",
              "default": 90,
              "description": "The minimum amount of seconds before a component is considered running and the restart counter is reset."
            },
            "shareAs": {
              "type": "string",
              "description": "Determines which SHAREAS mode should be used when starting a component",
              "enum": ["no", "yes", "must", ""],
              "default": "yes"
            }
          }
        },
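
To see why the misplaced braces produce the "unspecified additional property not allowed" message, here is a minimal sketch that mimics only the additionalProperties: false check. The rejected_keys function is a hypothetical stand-in, not the validator that configmgr uses, and the two dictionaries are trimmed copies of the schema fragments above.

```python
def rejected_keys(schema, config):
    """Return config keys that 'additionalProperties: false' would reject."""
    if schema.get("additionalProperties", True) is not False:
        return []
    allowed = schema.get("properties", {})
    return sorted(key for key in config if key not in allowed)


# Before the fix: minUptime and shareAs are nested inside restartIntervals,
# so "properties" only declares restartIntervals.
broken = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "restartIntervals": {
            "type": "array",
            "items": {"type": "integer"},
            "minUptime": {"type": "integer"},
            "shareAs": {"type": "string"},
        }
    },
}

# After the fix: all three are siblings under "properties".
fixed = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "restartIntervals": {"type": "array", "items": {"type": "integer"}},
        "minUptime": {"type": "integer"},
        "shareAs": {"type": "string"},
    },
}

launcher_config = {"shareAs": "no"}
print(rejected_keys(broken, launcher_config))  # ['shareAs']
print(rejected_keys(fixed, launcher_config))   # []
```

With the broken layout, shareAs is never declared as a property of launcher, so any value for it is rejected regardless of spelling or YAML indentation.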

hockeyrob commented 1 year ago

@hockeyrob , did you indent so it looks like

  launcher:
    shareAs: no

Yes, I did the proper indentation...2 spaces before launcher and 4 before shareAs.

hockeyrob commented 1 year ago

@1000TurquoisePogs : Made that change. It starts up faster. Still not accepting incoming connections. Gateway is configured to use port 7554, but there is nothing listening on that port. The only port on which anything is listening is 7557, ZSS. I'm doing more searching through the sysprint and other logs to see what else may have happened. Film at 11.
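
One way to confirm which ports are accepting connections is a plain TCP connect test. The sketch below is a generic illustration rather than a Zowe tool: is_listening is a hypothetical helper, and 127.0.0.1 is an assumed host that should be replaced with the address the servers actually bind to.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port is accepted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error number otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Ports mentioned in this thread: 7554 (gateway) and 7557 (ZSS).
    for name, port in (("gateway", 7554), ("zss", 7557)):
        state = "listening" if is_listening("127.0.0.1", port) else "not listening"
        print(f"{name} port {port}: {state}")
```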

hockeyrob commented 1 year ago

@1000TurquoisePogs Okay, it's starting up quickly now, which gives me more opportunities to kill it, change something, restart it, and iterate. I've set two of the debugging flags; don't want to set too many to keep down the noise. Still not working.

At this point the original problem is resolved; since I updated that schema it hasn't gotten any protection exceptions, so we can close this case, if you want. It's still not working; 7557 is the only port on which some part of the application is listening, so...the gateway isn't listening, even though the launcher seems to say it's up.

When I stop ZWESLSTC, whether I use zwe stop or the STOP operator command, 8 or 10 processes remain running, and I have to cancel most individually by ASID. When I do, I get messages about the task not being undubbed...so I started socket/sockapi traces to see whether there were any tasks connecting to TCP/IP and just not creating a socket for bind/listen...to no avail. There were several things (~100) in the trace that I think are from shell scripts checking on available/required ports...but no other connections to TCP/IP by the ZWESLSTC job.

I've specified verifyCertificates: DISABLED, because I don't have all the certificate pieces figured out from your descriptions, so all I want to do is get the thing running, and then I'll worry about certificates when they matter. Looks like it is still complaining about the pieces/parts that are only partly complete; certificate.pem.key just had a file name and no associated file, which it then couldn't read at all, no surprise. I copied some data to that file, and now I'm getting a complaint about "Unparsed DER bytes remain after ASN.1 parsing" of my CA cert.....I thought when I said not to verify certificates it would, you know, not verify certificates. Is this the next problem to resolve? I've created a .kdb with gskkyman, created/imported a CA cert, generated a server cert....but I can't tell whether Zowe can read the .kdb, or if I need to extract all the certs in the .kdb, and make up some keys for those certs, some of which won't have any...and then figure out....OK, it's getting late. I'll beat on this again tomorrow.

1000TurquoisePogs commented 1 year ago

I'd appreciate if this issue ticket was closed and a new one opened up, because for the sake of an open source project, other users with the same problems will want to search for them, and wouldn't find your latest issue at the bottom of this conversation.

Can you make a new issue ticket focusing on your lack of ports listening? As far as I know, STOP is the right command to issue, but STOP sends a SIGTERM to unix processes, and if they are stuck, they may not end without a SIGKILL. Perhaps the fact you cannot STOP correctly is related to why the servers are also not listening.
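
The SIGTERM-then-SIGKILL escalation described above can be sketched generically for any POSIX process. This is not how the z/OS STOP command or the Zowe launcher is implemented; stop_process is a hypothetical helper and the grace period is an arbitrary choice.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=10.0):
    """Send SIGTERM, then fall back to SIGKILL if the process does not exit."""
    proc.terminate()  # SIGTERM on POSIX: a request the process may handle or ignore
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL on POSIX: cannot be caught or ignored
        proc.wait()
        return "killed"


if __name__ == "__main__":
    # Stand-in child process that just sleeps; a stuck server would take its place.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_process(child, grace_seconds=5.0))
```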

I thought when I said not to verify certificates it would, you know, not verify certificates.

It controls verification of certificates on network traffic, that is, whether the certificates have valid claims like expiration date and hostname. But servers still need certificates to present to the browsers, so a certificate parsing error is still an issue that must be solved.

zwe init certificate can make you a simple keystore, though there is a known issue where the newest versions of Java create a keystore that can't be read by systemssl (https://github.com/zowe/docs-site/issues/2459), which ZSS (port 7557) uses; the rest of Zowe would be unaffected. So give zwe init certificate a try to make progress; certificates can be further configured later, as that is often the most time-consuming last step.

hockeyrob commented 1 year ago

Closing this issue. Will open another after spending more time trying to figure out the certs.