SDL-Hercules-390 / hyperion

The SDL Hercules 4.x Hyperion version of the System/370, ESA/390, and z/Architecture Emulator

z/VM 6.4 Issues FILE PROTECT errors on DASD #166

Closed fbi-ranger closed 5 years ago

fbi-ranger commented 5 years ago

During logon of service machines of z/VM 6.4 the OPERATOR gets a number of FILE PROTECT errors on the z/VM system disks.

11:42:52 HCPERP513I  DASD  6400 AN OPERATION WAS TERMINATED BECAUSE A FILE 
11:42:52 HCPERP513I  PROTECT ERROR OCCURRED 
11:42:52 HCPERP6300I SENSE DATA FORMAT = 00       MSG CODE = 00      
11:42:52 HCPERP6301I CHANNEL COMMAND WORD COMMAND CODE = 92      
11:42:52 HCPERP6302I SEEK ADDRESS =   000001ED0004 
11:42:52 HCPERP6303I SENSE = 00040000 00FFFF00 00000000 00000000 00000000  
11:42:52 HCPERP6303I 00000000 00000080 0001ED04  
11:42:52 HCPERP6304I IRB = 00C24017 07F47620 0E400008 00800000  
11:42:52 HCPERP6305I USERID = EREP     
11:42:52 HCPERP2216I CHANNEL PATH ID = 64
11:42:52 HCPCLS6056I XAUTOLOG information for AUTOLOG1: The IPL command is verifed by the IPL command processor.
11:42:52 HCPCLS6056I XAUTOLOG information for DISKACNT: The IPL command is verifed by the IPL command processor.
11:42:52 HCPERP513I  DASD  6400 AN OPERATION WAS TERMINATED BECAUSE A FILE 
11:42:52 HCPERP513I  PROTECT ERROR OCCURRED 
11:42:52 HCPERP6300I SENSE DATA FORMAT = 00       MSG CODE = 00      
11:42:52 HCPERP6301I CHANNEL COMMAND WORD COMMAND CODE = 92      
11:42:52 HCPERP6302I SEEK ADDRESS =   000001E70002 
11:42:52 HCPERP6303I SENSE = 00040000 00FFFF00 00000000 00000000 00000000  
11:42:52 HCPERP6303I 00000000 00000080 0001E702  
11:42:52 HCPERP6304I IRB = 00C24017 07F47620 0E400008 00800000  
11:42:52 HCPERP6305I USERID = EREP     
11:42:52 HCPERP2216I CHANNEL PATH ID = 64
11:42:52 HCPCLS6056I XAUTOLOG information for OPERSYMP: The IPL command is verifed by the IPL command processor.
11:42:53 HCPERP513I  DASD  6400 AN OPERATION WAS TERMINATED BECAUSE A FILE 
11:42:53 HCPERP513I  PROTECT ERROR OCCURRED 
11:42:53 HCPERP6300I SENSE DATA FORMAT = 00       MSG CODE = 00      
11:42:53 HCPERP6301I CHANNEL COMMAND WORD COMMAND CODE = 92      
11:42:53 HCPERP6302I SEEK ADDRESS =   000001E80001 
11:42:53 HCPERP6303I SENSE = 00040000 00FFFF00 00000000 00000000 00000000  
11:42:53 HCPERP6303I 00000000 00000080 0001E801  
11:42:53 HCPERP6304I IRB = 00C24017 07F47620 0E400008 00800000  
11:42:53 HCPERP6305I USERID = EREP     
11:42:53 HCPERP2216I CHANNEL PATH ID = 64
11:42:53 HCPERP513I  DASD  6400 AN OPERATION WAS TERMINATED BECAUSE A FILE 
11:42:53 HCPERP513I  PROTECT ERROR OCCURRED 
11:42:53 HCPERP6300I SENSE DATA FORMAT = 00       MSG CODE = 00      
11:42:53 HCPERP6301I CHANNEL COMMAND WORD COMMAND CODE = 92      
11:42:53 HCPERP6302I SEEK ADDRESS =   000001E90001 
11:42:53 HCPERP6303I SENSE = 00040000 00FFFF00 00000000 00000000 00000000  
11:42:53 HCPERP6303I 00000000 00000080 0001E901  
11:42:53 HCPERP6304I IRB = 00C24017 07F47620 0E400008 00800000  
11:42:53 HCPERP6305I USERID = DISKACNT 
11:42:53 HCPERP2216I CHANNEL PATH ID = 64
11:42:53 HCPERP513I  DASD  6400 AN OPERATION WAS TERMINATED BECAUSE A FILE 
11:42:53 HCPERP513I  PROTECT ERROR OCCURRED 
11:42:53 HCPERP6300I SENSE DATA FORMAT = 00       MSG CODE = 00      
11:42:53 HCPERP6301I CHANNEL COMMAND WORD COMMAND CODE = 92      
11:42:53 HCPERP6302I SEEK ADDRESS =   000001EA0001 
11:42:53 HCPERP6303I SENSE = 00040000 00FFFF00 00000000 00000000 00000000  
11:42:53 HCPERP6303I 00000000 00000080 0001EA01  
11:42:53 HCPERP6304I IRB = 00C24017 07F47620 0E400008 00800000  
11:42:53 HCPERP6305I USERID = EREP     
11:42:53 HCPERP2216I CHANNEL PATH ID = 64
11:42:53 HCPERP513I  DASD  6400 AN OPERATION WAS TERMINATED BECAUSE A FILE 
11:42:53 HCPERP513I  PROTECT ERROR OCCURRED 
11:42:53 HCPERP6300I SENSE DATA FORMAT = 00       MSG CODE = 00      
11:42:53 HCPERP6301I CHANNEL COMMAND WORD COMMAND CODE = 92      
11:42:53 HCPERP6302I SEEK ADDRESS =   000001EB0001 
11:42:53 HCPERP6303I SENSE = 00040000 00FFFF00 00000000 00000000 00000000  
11:42:53 HCPERP6303I 00000000 00000080 0001EB01  
11:42:53 HCPERP6304I IRB = 00C24017 07F47620 0E400008 00800000  
11:42:53 HCPERP6305I USERID = AUTOLOG1 
11:42:53 HCPERP2216I CHANNEL PATH ID = 64
11:42:54 HCPERP513I  DASD  6400 AN OPERATION WAS TERMINATED BECAUSE A FILE 
11:42:54 HCPERP513I  PROTECT ERROR OCCURRED 
11:42:54 HCPERP6300I SENSE DATA FORMAT = 00       MSG CODE = 00      
11:42:54 HCPERP6301I CHANNEL COMMAND WORD COMMAND CODE = 92      
11:42:54 HCPERP6302I SEEK ADDRESS =   000001EC0001 
11:42:54 HCPERP6303I SENSE = 00040000 00FFFF00 00000000 00000000 00000000  
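
For anyone decoding these messages by hand, here is a minimal Python sketch (not part of Hercules or z/VM). It assumes that the SEEK ADDRESS field of HCPERP6302I uses the conventional 6-byte BBCCHH layout and that the File Protected flag is the 0x04 bit of sense byte 1. The function and constant names are made up for illustration.

```python
# Illustrative helpers only; names are hypothetical, not taken from Hercules.

SENSE1_FILE_PROTECTED = 0x04  # assumed: "file protected" bit in sense byte 1


def parse_seek_address(hex_text):
    """Split a 6-byte BBCCHH seek address into (bin, cylinder, head)."""
    raw = bytes.fromhex(hex_text)
    if len(raw) != 6:
        raise ValueError("seek address must be exactly 6 bytes")
    bin_no = int.from_bytes(raw[0:2], "big")
    cylinder = int.from_bytes(raw[2:4], "big")
    head = int.from_bytes(raw[4:6], "big")
    return bin_no, cylinder, head


def is_file_protected(sense_hex):
    """Return True if sense byte 1 has the file-protected bit set."""
    sense = bytes.fromhex(sense_hex)  # whitespace between words is ignored
    return bool(sense[1] & SENSE1_FILE_PROTECTED)


bin_no, cylinder, head = parse_seek_address("000001ED0004")
print(f"cyl=0x{cylinder:04X} head={head}")
print(is_file_protected("00040000 00FFFF00"))
```

Under those assumptions the first message above refers to cylinder 0x01ED, head 4, which matches the trailing `0001ED04` in the second HCPERP6303I line.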

Here is the associated CCW trace that was done during the startup of those service machines:

Fish-Git commented 5 years ago

One of my SoftDevLabs customers (a long-time mainframer with 30 years' experience as a Systems Programmer) has identified the root cause of this problem and sent me his fix for it (which I will be committing shortly).

As he explained it to me, the File Protect errors are occurring due to incorrect Hercules handling of a "suffixed" multitrack Read Count CCW at the end of a Locate Record Extended domain. It's supposed to return the count field of the first record that was read on that track, but is instead automatically advancing to the next track (because of multitrack) thereby leading to the File Protect error (since, after the track advancement, the I/O is now outside the valid range set by DE/LRE).
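
To make the description above concrete, here is a toy Python model of the decision. It is only a sketch based on the explanation in this comment, not the code in the commit. The `Extent` class, the `read_count_target` function and the track numbers are all hypothetical.

```python
from dataclasses import dataclass

FILE_PROTECT = "FILE PROTECT"  # stand-in for the unit check, not a Hercules constant


@dataclass(frozen=True)
class Extent:
    """Inclusive range of track numbers the channel program may touch (toy model)."""
    first_track: int
    last_track: int


def read_count_target(track, extent, suffixed, fixed):
    """Return the track a multitrack Read Count operates on at end of track.

    track    -- track the previous CCW in the domain finished on
    suffixed -- True if this Read Count is the suffix of an LRE domain
    fixed    -- True to model the corrected behavior, False the old one
    """
    if suffixed and fixed:
        target = track       # stay put: report a count field from this track
    else:
        target = track + 1   # ordinary multitrack advance to the next track
    if not extent.first_track <= target <= extent.last_track:
        return FILE_PROTECT  # the advance left the permitted extent
    return target


extent = Extent(first_track=100, last_track=104)
print(read_count_target(104, extent, suffixed=True, fixed=False))  # old behavior
print(read_count_target(104, extent, suffixed=True, fixed=True))   # corrected behavior
```

In this model the old behavior only fails when the domain ends on the last track of the extent, which would be consistent with the errors showing up on some I/Os and not others.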

For reference, he referred me to pages 47-48 of manual SC26-7298-01: "Enterprise Storage Server - System-390 Command Reference - 2105 Models E10, E20, F10, and F20".

Fish-Git commented 5 years ago

Should now (hopefully!) be fixed by commit 4a8ad33dba76221d0e9fa56595d85aa7c13ef29b.

@fbi-ranger Florian? Can you please confirm? (I don't have z/VM 6.4)

Thanks!

fbi-ranger commented 5 years ago

I can confirm that this issue is now fixed and can be closed! z/VM 6.4 IPLs now without complaint.

Thanks for the really quick fix!!

A small note: I don't think this problem is limited to z/VM 6.4. I already ran into it with z/VM 6.3 back in April 2017 when using full pack minidisks for a z/OS guest. The problem is reported as "Using DASD on emulated CU 3990-3 or -6 generates HCPERP513I error under z/VM (#218)" in Hercules 4.0.0, where the issue is "still under discussion".

Fish-Git commented 5 years ago

> I can confirm that this issue is now fixed and can be closed! z/VM 6.4 IPLs now without complaint.

Fantastic! I will close this issue.

> Thanks for the really quick fix!!

Don't thank me! Thank Bob Wood. He's the one that provided the fix. All I did was commit it.

fbi-ranger commented 5 years ago

Thanks to Bob Wood. Excellent work!

BR Florian

Fish-Git commented 5 years ago

Something I forgot to ask you @dasdman Mark, was whether Bob's code to fix this issue (commit 4a8ad33dba76221d0e9fa56595d85aa7c13ef29b) seemed reasonable/correct to you.

Based on my own quick review it certainly looks completely reasonable and correct to me, but I admit to not taking a close look at it. I only gave it a quick once over.

If it's not too much trouble I'd very much appreciate you giving it a quick review to see whether or not anything jumps out at you as being wrong. Thanks!

Fish-Git commented 5 years ago

Mark (dasdman) replied off list with the following response (paraphrased):

It is "functional" code, but it won't pass my test mechanisms (it breaks). The unfortunate part is that I don't have the cycles at present(*) to isolate the failing test cases into individual generic and specific issues. On a quick scan of the results, it appears to introduce three new issues.

Also, I have a problem with the ad-hoc addition of 0xE7, without any additional references, and treating it identical to Locate Record. The addition of 0xE7 means that there are fields - by definition - that are being improperly handled with no trail that says anything about the new command code.


(*) I have to follow the potential money trail at present with the project I'm working on...

  I wish I knew what the three new issues were (if I knew then Bob or I might have a chance of maybe fixing them), but I really don't want to bother Mark any further. (**)

As far as the undocumented E7 code goes, I'll try to go back and add some comments mentioning the manual Bob said he used as reference: SC26-7298-01: "Enterprise Storage Server - System-390 Command Reference - 2105 Models E10, E20, F10, and F20". (What sucks about that manual though, is while it documents the various channel commands (their behavior/functioning), it doesn't document what the CCW opcodes are! IBM up to their old tricks again!)


(**) When I spoke to him on the phone he said he was in a severe financial pinch due to a long term health issue with a family member and his/their health insurance wasn't covering its full cost. I sure wish there was something we could do for him! It sounds like he's really hurting. :(

Fish-Git commented 5 years ago

I've decided to re-open this issue because there is clearly more work to be done.

While I don't know what the three new issues are that Mark mentioned (see previous comment), one thing I was thinking we could possibly do is re-introduce (*) support for the cu=2105 control unit option.

Then I could maybe go back through Bob's new code that fixed this issue and wrap it with a test for control unit model 2105 in the hopes that maybe that might fix some of the issues Mark mentioned.

This would of course require users experiencing the specific issue that this GitHub Issue was created for (i.e. the File Protect errors that occur on z/VM that Florian mentioned) to then have to specify cu=2105 on their Hercules dasd device configuration file statements, but from the sounds of it that's technically what they should be doing anyway! Yes?

How do the rest of you feel about this?

I don't want to do anything yet unless the rest of you feel what I'm suggesting is indeed what should be done. Thanks!

 


(*) See e.g.: https://github.com/SDL-Hercules-390/hyperion/blob/6cab259d0d946de4fb21ec4e33984ee5d9f2cd9f/dasdtab.c#L197

and:

https://github.com/SDL-Hercules-390/hyperion/blob/6cab259d0d946de4fb21ec4e33984ee5d9f2cd9f/ckddasd.c#L6010-L6011

  So it appears we did have support for 'cu=2105' at one time!  

fbi-ranger commented 5 years ago

Dear all,

1.) With the current fix applied I run z/OS 1.13 and VSE/ESA 2.4 on z/VM 6.4 without any problem. At least the IPL process and the "normal" work do not show any abnormal behavior in those OSes. Under z/OS I also did some weird things, such as zapping a FORMAT4 DSCB to convert the DASD from IX to OS format and vice versa. All worked well. Of course that doesn't mean that something else can't pop up. Anyway, the applied fix solved the problem I opened this issue for.

2.) While comparing zPDT (2107) with Herc (3990), I found that in the NED (node element descriptor) and NED2, bit 3 of byte 0 should be set to '1', because as it stands the DEVSERV command shows the DEV-SERIAL as *INVALID*:

IEE459I 16.09.33 DEVSERV QDASD       FRAME  1     F      E   SYS=ADCD113
 UNIT VOLSER SCUTYPE DEVTYPE       CYL  SSID SCU-SERIAL DEV-SERIAL EFC  
00A80 ZDRES1 39900C2 339000A      3339  0A80 XXZZ-00001 *INVALID*  BYP  

The current setting, bit 3 = 0, says that the serial number is invalid. Bit 3 = 1 would be consistent with all 3990 and later control units, including the DS8700 and DS8870.

In NED2 the Byte 1 should be set to 02 (currently 00) and in NED3 Byte 1 should be 00 (currently 02). I am not sure if this is only cosmetic.

Another example: On the real machines DS8700 and DS8870 I see that bytes 62 and 63 contain the number of cylinders, while in the specs of the 3990-x these bytes are reserved.

Furthermore, in the device characteristics information, byte 6 bits 4-6 are reserved for a 3390 on a 3990. On the real machines DS8700 and DS8870 they are set to 111. So zPDT does reflect more of the undocumented functions and is therefore much closer to the real iron than we are with the 3990-3/9 implementation.
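
The bit positions in point 2 are easy to get wrong when translating them into masks. The following Python sketch assumes they follow IBM's usual numbering, where bit 0 is the most significant bit of the byte. The helper names and the sample byte values are hypothetical and are not read from any real NED or RDC data.

```python
def ibm_bit_mask(bit, width=8):
    """Mask for one bit in IBM numbering (bit 0 = most significant bit)."""
    if not 0 <= bit < width:
        raise ValueError("bit number out of range")
    return 1 << (width - 1 - bit)


def ibm_bits_mask(first, last, width=8):
    """Mask covering the inclusive bit range first..last in IBM numbering."""
    mask = 0
    for bit in range(first, last + 1):
        mask |= ibm_bit_mask(bit, width)
    return mask


# "Bit 3 of byte 0" corresponds to mask 0x10.
serial_valid_mask = ibm_bit_mask(3)
sample_byte0 = 0x00                      # made-up starting value
print(hex(sample_byte0 | serial_valid_mask))

# "Byte 6 bits 4-6 set to 111" corresponds to mask 0x0E.
print(hex(ibm_bits_mask(4, 6)))
```

Whether setting these bits is sufficient, or whether other fields must change with them, is not something this sketch can answer.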

I have added the devserv commands from DS8700 and a DS8870:

3.) Just a proposal from my side that I would like to discuss.

Wouldn't it make sense if the functionality of dasdtab.c and ckddasd.c was split into several distinct parts? I find it absolutely confusing what is going on there, with all the differences between the CUs, devices, advanced mode, etc.

I would like to propose that all the legacy devices, which probably will not see many changes any more, are kept in the current modules, but that the more current devices such as 2105 and later, especially those with undocumented CCWs, are separated into a new set of modules. There should be a basic implementation of the I/O and then a device-specific module for a 3390-x or 2107-x etc. I think we would save a lot of if / then logic if we could say that a device is e.g. a 2107: it has exactly that sort of behavior, and its logic does not get mixed up with that of other device types.

Exactly as it is shown above when it is not 3390 and not 2105.... I think it would help in better "understanding" the already very complex matter.

fbi-ranger commented 5 years ago

Unfortunately there is no discussion on this matter.

I faced a new FILE PROTECT issue. This time with z/OS 2.3 IPL under z/VM 6.4:

09:00:36 09:00:36 HCPERP513I  DASD  3A82 AN OPERATION WAS TERMINATED BECAUSE A FILE 
09:00:36 09:00:36 HCPERP513I  PROTECT ERROR OCCURRED 
09:00:36 09:00:36 HCPERP6300I SENSE DATA FORMAT = 00       MSG CODE = 00      
09:00:36 09:00:36 HCPERP6301I CHANNEL COMMAND WORD COMMAND CODE = 92      
09:00:36 09:00:36 HCPERP6302I SEEK ADDRESS =   0000183D0003 
09:00:36 09:00:36 HCPERP6303I SENSE = 00040000 00FFFF00 00000000 00000000 00000000  
09:00:36 09:00:36 HCPERP6303I 00000000 00000080 00183D03  
09:00:36 09:00:36 HCPERP6304I IRB = 00C24017 5FFE95D0 0E400008 00800000  
09:00:36 09:00:36 HCPERP6305I USERID = S0W1     
09:00:36 09:00:36 HCPERP2216I CHANNEL PATH ID = 3A

The complete trace is here:

The DASD 3A82 is a full pack minidisk for the guest S0W1.

Fish-Git commented 5 years ago

@fbi-ranger Florian: can you provide more information please? What exact version of SDL Hyperion are you using? (i.e. what's the commit hash? If you built it from a git clone, the Hercules version command should be good enough.) Also, what does your user direct for your z/OS guest look like? I'd like to try and reproduce this error with my existing z/VM 6.3. Does the problem occur with z/VM 6.3? Or only with z/VM 6.4? Thanks!

Fish-Git commented 5 years ago

Also, does the problem happen only under z/VM? Or does it also occur when IPL'ed natively? I have z/OS 2.3 and it IPLs fine! But I have not tried it under z/VM yet. I need to know what the user direct should look like, etc. Thanks.

fbi-ranger commented 5 years ago

Fish,

I didn't try running it without z/VM. So as your z/OS runs fine, I suppose it is related to z/VM and also to the fact that it is defined as fullpack minidisk.

The release of Hyperion is:

HHC01413I Hercules version 4.2.0.0-SDL-g0f1b54b5-modified (4.2.0.0)
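
As an aside on reading that version line: the commit hash asked for is embedded in it. The following Python sketch pulls it out, assuming the `-g<hex>` part follows the `git describe` convention (abbreviated commit hash prefixed with `g`) and that `-modified` marks a build from a tree with local changes. The regular expression is only fitted to the line shown here, not to every format Hercules may print.

```python
import re

# Pattern fitted to the single HHC01413I line quoted above (an assumption,
# not a documented Hercules format).
VERSION_RE = re.compile(
    r"Hercules version (?P<version>\d+(?:\.\d+)*)"
    r"-SDL-g(?P<commit>[0-9a-f]+)"
    r"(?P<modified>-modified)?"
)


def parse_version_line(line):
    """Return version, abbreviated commit and modified flag, or None."""
    match = VERSION_RE.search(line)
    if match is None:
        return None
    return {
        "version": match.group("version"),
        "commit": match.group("commit"),
        "modified": match.group("modified") is not None,
    }


line = "HHC01413I Hercules version 4.2.0.0-SDL-g0f1b54b5-modified (4.2.0.0)"
print(parse_version_line(line))
```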

Directory is:

USER S0W1  OP 4G 8G BFG
*CRYPTO DOMAIN 14 APDEDICATED 5
   COMMAND TERMINAL BREAKIN MINIMAL HI ON


Fish-Git commented 5 years ago

MDISK 0A80 3390 DEVNO 3A80 MWV NBCREAD
MDISK 0A81 3390 DEVNO 3A81 MWV NBCREAD
MDISK 0A82 3390 DEVNO 3A82 MWV NBCREAD
... etc ...

Have you tried using DEDICATE instead? (i.e. not mdisk?)

I just IPLed z/OS 2.3 under z/VM 6.3 and it worked just fine. No unusual messages or anything. Totally clean IPL.

I highly suspect it's your specifying your guest's dasd as MDISK instead of DEDICATE that's causing your file protect errors.

It's admittedly been probably 25-30 years since I've done any serious work with VM but in all my years as a VM system administrator I was always taught that guest virtual machines should always dedicate their dasds and not specify them as minidisks. I was told specifying dasd using mdisk (minidisk) statements causes a completely different CCW translation to be used by VM, whereas specifying them as dedicated dasds does not.

There's still some CCW translation occurring (I think) even when you use DEDICATE, but it's a minimal type of ccw translation since the pack is "owned" by the guest and not by the VM system. Thus the restrictions regarding I/O to the device are much more relaxed (are not as restrictive as they are when the dasd is specified as a minidisk and thus owned by the system).

Try using DEDICATE.

fbi-ranger commented 5 years ago

Dear Fish,

I know that DEDICATE works. However, DEDICATE does not allow the DASD to be shared between guests (e.g. for simple SYSPLEXing with CTCs instead of a Coupling Facility).

The implementation in Hyperion works very well nowadays, so it is strange that it fails only on the system disk (A82) of the ADCD system.

I think it should be investigated what is the root cause of this.

Fish-Git commented 5 years ago

> The implementation in Hyperion works very well nowadays, so it is strange that it fails only on the system disk (A82) of the ADCD system.

Does the system disk need to be shared? Maybe the system disk needs to be DEDICATEd? (i.e. you can define all of the dasd as full pack minidisks except the system disk, which must be DEDICATEd?)

> (e.g. for simple SYSPLEXing with CTCs instead of a Coupling Facility).

I am not familiar with this, so I cannot test. (And I do not have 6.4 either. I only have 6.3)

fbi-ranger commented 5 years ago

In the real world all the disks are shared so that the data can be accessed properly from the participating guest systems. This applies especially to the config disk, as it contains the shared system datasets for the sysplex.

The File Protect Error happens only once during IPL of z/OS 2.3 and I think it relates to the crash of the IOSAS address space:

  DUMPID=001 REQUESTED BY JOB (IOSAS   )                                    
  DUMP TITLE=COMPON=IOS BUILD CHANNEL TABLE,COMPID=SC1C3,ISSUER=I           
             OSVCHPT                                                        
 *01 IEA793A NO DUMP DATA SETS AVAILABLE FOR DUMPID=001 BY JOB (IOSAS   ).  
   USE THE DUMPDS COMMAND OR REPLY D TO DELETE THE DUMP                     
  IEF196I IEF237I 0A80 ALLOCATED TO SYS00037                                
  IEF196I IEA995I SYMPTOM DUMP OUTPUT                                       
  IEF196I SYSTEM COMPLETION CODE=378  REASON CODE=0000001C                  
  IEF196I  TIME=21.15.39  SEQ=00006  CPU=0000  ASID=0014                    
  IEF196I  PSW AT TIME OF ERROR  070C1000   817C36FE  ILC 2  INTC 0D        
  IEF196I    NO ACTIVE MODULE FOUND                                         
  IEF196I    NAME=UNKNOWN                                                   
  IEF196I    DATA AT PSW  017C36F8 - 00181610  0A0D98DC  D0088910           
  IEF196I    AR/GR 0: 008F8680/84000000   1: 00000000/84378000              
  IEF196I          2: 00000000/00000000   3: 00000001/0000F703              
  ADY012I THE FOLLOWING DAE OPTIONS ARE IN EFFECT: 727                      
     START                                                                  
     SVCDUMP  = NOTIFY(3,30)  MATCH  UPDATE  SUPPRESSALL                    
     SYSMDUMP = MATCH  UPDATE                                               
     RECORDS  = 400                                                         
     DSN      = SYS1.DAE                                                    
  IEE855I DUMPDS COMMAND RESPONSE 729                                       
  DUMPDS COMMAND SYS1.DUMP DATA SET STATUS                                  

and this is related to the old configuration the ADCD System uses (IBM M3000 OS390).

With z/OS 2.3 it is officially not possible to define a z/OS configuration in BASIC mode. At least HCD does not support this and it is stated in the system documentation.

The hardware must be in LPAR mode. On the other hand under zPDT ADCD works properly with this config.

I guess the same File Protect Error might happen with z/VM 6.3. At the moment it is not easy for me to switch back to z/VM 6.3 because I have already made a lot of config changes under 6.4 that are not in my old 6.3 system.

Fish-Git commented 5 years ago

> The File Protect Error happens only once during IPL of z/OS 2.3 and I think it relates to the crash of the IOSAS address space:

Odd. IOSAS does not crash for me at all. I can IPL z/OS 2.3 natively and it does not crash. I can IPL z/OS 2.3 under z/VM 6.3 and it does not crash.

But then my z/OS dasds are all dedicated too (not shared).

I don't have any experience with such things, but I will try (when I find time) to define my z/OS 2.3 dasd (under z/VM 6.3) as being full pack minidisks and see whether that causes IOSAS to crash or not.

I wish I could help you more on this, but I might have to defer this to someone more skilled at z/VM (like maybe Ivan?).

Or maybe it's some special z/OS configuration setting you're missing? I have no practical experience with z/OS either! Do you maybe have to tell z/OS that its dasds are being shared?

fbi-ranger commented 5 years ago

I was referring to that discussion:

https://hercules-390.yahoogroups.narkive.com/Is6GxlID/iosas-abend-378-1c-z-os-1-13

Same symptoms. It is interesting that your installation works but it may depend on the CP-Directory entry of the guest. Maybe you have other options. This is mine:

USER S0W1  OP 4G 8G BFG
*CRYPTO DOMAIN 14 APDEDICATED 5
   COMMAND TERMINAL BREAKIN MINIMAL HI ON
* ****************************************************************** *
* *  IPL definitions                                               * *
* ****************************************************************** *
   CPU 00 BASE
   CPU 01
*  IPL CMS
*  IPL 0A80 LOADP 0A82WSM1
   IPL 0A80 LOADP 0A82A0M1
   MACHINE ESA 3
   **OPTION MAINTCCW DEVINFO** 
   OPTION TODENABLE
   STDEVOPT LIBRARY CTL DASDSYS DATAMOVER
   CONSOLE  0009 3270
   DEDICATE 0700 0700
   DEDICATE 0710 0710
   SPECIAL  0701 3270
   SPECIAL  0702 3270
   SPECIAL  0703 3270
*  DEDICATE C001 C002
*  DEDICATE C101 C102
* ****************************************************************** *
* *  C3 - SYSTEM RESIDENCE AND PRODUCT DASD                        * *
* ****************************************************************** *
   MDISK 0A80 3390 DEVNO 3A80 MWV NBCREAD
   MDISK 0A81 3390 DEVNO 3A81 MWV NBCREAD
   MDISK 0A82 3390 DEVNO 3A82 MWV NBCREAD
   MDISK 0A83 3390 DEVNO 3A83 MWV NBCREAD 
   ... (etc) ...

Fish-Git commented 5 years ago

> ... but it may depend on the CP-Directory entry of the guest. Maybe you have other options.

Here's mine:

**********************************************************************
*                   FISHTEST z/OS
**********************************************************************

USER ZOS ZOS  6G 6G  BFG
MACHINE ESA 4
OPTION MAINTCCW LNKEXCLU QUICKDSP
STDEVOPT DASDSYS DATAMOVER LIBRARY CTL

CPU 00
CPU 01
CPU 02
CPU 03

IPL CMS
CONSOLE 01F 3270 C

MDISK 191 3390 009220 000010 M01RES MR
LINK MAINT 190 190 RR
LINK MAINT 19D 19D RR
LINK MAINT 19E 19E RR

NICDEF 0400 TYPE QDIO DEV 3 CHPID F0 LAN SYSTEM VSW1

SPOOL 00C 2501 A
SPOOL 00E 1403 A
SPOOL 00F 1403 A

DEDICATE 700 700
SPECIAL  701 3270
SPECIAL  702 3270
SPECIAL  703 3270

DEDICATE A80 A80
DEDICATE A81 A81
DEDICATE A82 A82
... (etc) ...

and then I manually logon to the ZOS userid (at terminal device 460) and use an IPLZOS EXEC to manually IPL my z/OS guest that looks like this:

/*-------------------------------------------------------------------*/
/*                         ZOS PROFILE EXEC                          */
/*-------------------------------------------------------------------*/

syscon  = '0700'    /* Master Console */

sysres  = '0A80'
syscat  = '0A82'

iplcode = 'WS'      /* Warm Start     */
msgsopt = 'M'       /* Verbose msgs?  */

/* Verify we have a Master Console */

'CP QUERY VIRTUAL' syscon

if rc \= 0 then
do
  save_rc = rc
  say 'ERROR: device' syscon '(Master Console) does not exist!'
  exit save_rc
end

/* Build LOADPARM from supplied variables */

loadparm = syscat||iplcode||msgsopt

/* 'cp sp cons for * class a' */
/* 'cp trace instr data b25f range 150461E.4 run cmd d t0.fff;base1' */

/* Issue the IPL command...

   NOTE: The following command should obviously NOT return!
*/
"PIPE CP IPL" sysres "LOADPARM" loadparm "| STEM cpmsg."

/* If we reach here then the IPL command obviously failed! */

say 'ERROR: IPL failed! rc='rc': "'cpmsg.1'"'

Fish-Git commented 5 years ago

Someone sent me an email off list that alludes to the problem being incorrect(?) (inaccurate?) RDC/RCD (Read Device Characteristics, Read Configuration Data) information being given to z/VM by Hercules:

> The chain of events makes it very clear what is happening. VM is intercepting based on RDC/RCD information in shared minidisk mode and presenting the file protect error. This occurs because a bit setting is incorrect regarding the type of "real" DASD. What is needed is a copy of the real device RDC/RCD data from the lowest supported level of DASD per z/OS 2.3.

> The failure is not the fault of VM, z/OS, nor Hercules. Only that the base level of DASD support for z/OS has changed.

I believe you have a z/PDT, yes? Would it be possible for you to provide a display of what the RDC and RCD CCWs return for your shared dasds for: 1) z/OS 2.3 native, 2) z/VM native, and 3) z/OS 2.3 under z/VM? There are probably some bits that the z/PDT is setting that Hercules is not.

Note too that because IBM does not document such bit settings, determining which bit(s) we need to set (or not set!) might be very difficult if not impossible. Nevertheless, it might be worth a try.
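
If such dumps become available, a byte-by-byte comparison is a quick way to narrow down candidate bits. Here is a minimal Python sketch of that idea. The two hex strings below are made-up placeholders, not real RDC or RCD data, and the function name is hypothetical. Changed bits are reported in IBM numbering (bit 0 = most significant).

```python
def diff_bytes(a_hex, b_hex):
    """Compare two hex dumps and list (offset, a_byte, b_byte, changed_bits).

    Only the common prefix is compared; a length difference should be
    checked separately by the caller.
    """
    a = bytes.fromhex(a_hex)
    b = bytes.fromhex(b_hex)
    diffs = []
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            changed = [bit for bit in range(8) if (x ^ y) & (0x80 >> bit)]
            diffs.append((offset, x, y, changed))
    return diffs


# Placeholder data for illustration only.
hercules_dump = "3990E933 90000000"
zpdt_dump = "3990E93B 9000000E"

for offset, x, y, bits in diff_bytes(hercules_dump, zpdt_dump):
    print(f"byte {offset}: {x:02X} -> {y:02X}, bits {bits}")
```

Running the same comparison for each of the three cases (z/OS native, z/VM native, z/OS under z/VM) would show whether z/VM alters what the guest sees.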

fbi-ranger commented 5 years ago

I already did that exercise: if you look back at my comment here from January 7th, 2019, the differences are already explained there. Tomorrow I can send you an even more detailed comparison between Hercules and zPDT.

There are also a lot of differences, since we still use a 3990-6 CU while zPDT uses 2107 / DS8870.

Fish-Git commented 5 years ago

> I already did that exercise: if you look back at my comment here from January 7th, 2019, the differences are already explained there.

Oops. You are correct. My bad. Sorry.

> There are also a lot of differences, since we still use a 3990-6 CU while zPDT uses 2107 / DS8870.

For which there is unfortunately very little documentation. :(

It's difficult to provide proper support for devices when IBM refuses to make their functionality public. :(

Fish-Git commented 5 years ago

> I don't have any experience with such things, but I will try (when I find time) to define my z/OS 2.3 dasd (under z/VM 6.3) as being full pack minidisks and see whether that causes IOSAS to crash or not.

I didn't notice that IOSAS crashed, but I am getting the File Protect errors in z/VM 6.3 now. Virtually every time.

More later as I continue my research...

fbi-ranger commented 5 years ago

Hi Fish,

Another thing I cannot really trace: when I shut down z/VM after running it for, let's say, a day, Hercules does not stop any more. I can shut down the z/VM system in an orderly way and it enters the disabled wait. Then I do a 'quit'. The processor threads end, and then it hangs before the summary of all the DASDs is presented.

Interestingly, the local 3270 sessions (NONSNA) get disconnected, but Hercules still sends messages to its console that the device is not available. Probably because the respective thread has also ended.

The version is the latest I could pull from git:

HHC01413I Hercules version 4.2.0.0-SDL-gf4361bfa-modified (4.2.0.0)

The hercules process does not quit; only a kill -9 terminates it.

If I issue a quit immediately, without starting z/VM, it quits correctly. Maybe it has to do with the QETH driver. I added a QETH device EC00-EC02 and attached that to a virtual switch (LAYER3) of z/VM.

BR Florian

-- Best regards

Florian Bilek

Fish-Git commented 5 years ago

> The hercules process does not quit; only a kill -9 terminates it.

Please create a new GitHub Issue for this so we can look into it. Thanks.

(I don't want to pursue this new shutdown issue in this thread; I'd like to concentrate on the File Protect problem; your new shutdown/exit/quit problem needs to be in its own GitHub Issue/thread; Thanks)

fbi-ranger commented 5 years ago

Yes, OK. I will create a new issue once I can reproduce the hang.

fbi-ranger commented 5 years ago

An additional hint:

The File Protect error described above also happens occasionally during normal operation. In this case a new SYS1.IODFnn dataset had been created. When HCD reads the WORK IODF file, the same FILE PROTECT ERROR is generated (with a different seek address, of course).

Fish-Git commented 5 years ago

@fbi-ranger Florian,

After researching this issue for the past several days I have committed what I believe to be the proper(?) fix for this problem.

Please do a pull to merge in my commit, rebuild Hercules and try your tests again.

Hopefully your problem should be fixed now. Please let me know if it still isn't fixed or new problems arise. Thanks!

fbi-ranger commented 5 years ago

Fish, thank you very much indeed. The FILE PROTECT error during IPL no longer appears. Great job, Fish.

Florian