hercules-390 / hyperion

ECPS:VM causing 2nd level CP abends when IPLing guest MVS at 3rd level #189

Closed wably closed 7 years ago

wably commented 7 years ago

The ECPS support causes 2nd level CP (VM/370) to take a PRG017 abend and restart moments after an attempt to IPL MVS 3.8 at 3rd level. The PRG017 occurs because the SVC assist loads the SVC new PSW from MVS's page 0 instead of 2nd level CP's page 0. CP then program checks because the loaded PSW is not correct for its environment: it has DAT on and a virtual instruction address that is invalid as far as CP is concerned. The resulting page fault is one that CP cannot handle, which produces the PRG017 abend. The solution is below; for more details please continue reading.

The SVC assist loads the incorrect SVC new PSW because an LCTL 1 instruction issued shortly before by MVS was also assisted, which incorrectly placed MVS's CR1 contents into CR1 of 2nd level CP. When the SVC assist code then obtains the internal pointer to page 0 in order to fetch the new PSW, it gets the pointer to MVS's page 0 (because that is what CR1 now points to) and therefore fetches the wrong SVC new PSW.
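To make the chain of events concrete, here is a toy model in C. It is only a sketch and not Hercules code: every name in it is hypothetical, the CR1 values are invented, and the CP SVC new PSW is a made-up placeholder. Only the MVS SVC new PSW value is taken from the debug trace later in this thread.

```c
#include <stdint.h>

/* Toy model only: none of these names exist in Hercules. */
#define TOY_CP_CR1   0x00010000u   /* hypothetical CR1 of 2nd level CP */
#define TOY_MVS_CR1  0x00020000u   /* hypothetical CR1 loaded by MVS   */

/* SVC new PSW as stored in each system's page 0. */
#define TOY_CP_SVC_NEW_PSW   0x000C000000001000ull /* made-up placeholder   */
#define TOY_MVS_SVC_NEW_PSW  0x040C000000041DA2ull /* from the debug trace  */

struct toy_guest {
    uint32_t vcr1;   /* CR1 as the assist sees it for the 2nd level machine */
};

/* Unchecked LCTL 1 assist: blindly updates the 2nd level machine's CR1,
   even when the instruction really came from a 3rd level guest. */
static void toy_lctl1_assist(struct toy_guest *g, uint32_t new_cr1)
{
    g->vcr1 = new_cr1;
}

/* SVC assist: find page 0 through CR1, then fetch the SVC new PSW. */
static uint64_t toy_fetch_svc_new_psw(const struct toy_guest *g)
{
    return (g->vcr1 == TOY_MVS_CR1) ? TOY_MVS_SVC_NEW_PSW
                                    : TOY_CP_SVC_NEW_PSW;
}
```

Starting with `vcr1` set to `TOY_CP_CR1`, the fetch returns CP's own new PSW. After `toy_lctl1_assist(&g, TOY_MVS_CR1)`, the same fetch returns the MVS PSW, which is the mix-up described above.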

All of this occurs because instruction assist is missing a check for whether the virtual PSW is in problem state. When instruction assist is operating, the real machine must be in problem state by definition of the assist. However, at second level (be it any CMS user or guest operating system) any execution of a privileged instruction must be issued in virtual supervisor state. No problem there. But for a guest operating system such as VM at second level, the 2nd level CP will dispatch 3rd level users (CMS or other guest systems) in virtual problem state so that 2nd level CP can field instruction simulation requests for priv ops. Because ECPS does not check for virtual problem state in the virtual PSW, it tries to assist the instruction anyway. As a result, registers and/or control registers are updated for 2nd level CP even though a 3rd level machine issued the priv op.

This is difficult to explain easily, but suffice it to say that the assist is designed to help the real machine and 1st level CP by avoiding simulation of the supported privileged instructions issued by 2nd level users. The assist is not supposed to help 3rd level or higher users; their privileged instructions must be fielded by the operating system at the next lower level.

This issue is corrected by adding a quick check to see whether the 2nd level user is in virtual problem state. If so, control returns to CP and the assist is not performed. The check below is added to SASSIST_PROLOG in ecpsvm.c:

    /* 2017-01-23  Reject if Virtual PSW is in problem state */ \
    /* All instruction assists should be rejected if VPSW is in problem state */ \
    /* and be reflected back to CP for handling.  This affects 2nd level VM with */ \
    /* 3rd level guests. */ \
    if(CR6 & ECPSVM_CR6_VIRTPROB) \
    { \
        DEBUG_SASSISTX(_instname,WRMSG(HHC90000, "D", "SASSIST "#_instname" reject : Virtual problem state")); \
        return(1); \
    } \
    /* End of 2017-01-23 */ \

At present, ecpsvm.c does contain a virtual problem state check for the LPSW and SSM assists, but the other assisted instructions are not checked. This fix adds the check for all instructions by placing it in the prolog code, and removes the individual checks from the LPSW and SSM assist code.
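The benefit of putting the check in a shared prolog can be sketched with a small stand-alone example. This is not the real SASSIST_PROLOG: the macro, struct, and function names are hypothetical, and the CR6 bit value is an assumption made for illustration. The only thing carried over from the fix is the convention that returning 1 means "not assisted, let CP handle it".

```c
#include <stdint.h>

/* Hypothetical stand-in; the real definition lives in the Hercules sources. */
#define TOY_CR6_VIRTPROB 0x40000000u   /* assumed bit position */

/* Shared prolog: every assist that expands it gets the same early exit.
   Returning 1 means "not assisted, reflect the instruction to CP". */
#define TOY_SASSIST_PROLOG(cr6)            \
    do {                                   \
        if ((cr6) & TOY_CR6_VIRTPROB)      \
            return 1;                      \
    } while (0)

struct toy_vm {
    uint32_t cr6;        /* assist control bits for the running guest */
    uint32_t vcr[16];    /* guest's virtual control registers         */
    uint8_t  vsysmask;   /* guest's virtual system mask               */
};

/* Toy LCTL assist: loads one virtual control register. */
static int toy_assist_lctl(struct toy_vm *vm, int r, uint32_t value)
{
    TOY_SASSIST_PROLOG(vm->cr6);
    vm->vcr[r & 15] = value;
    return 0;            /* assisted */
}

/* Toy SSM assist: no per-instruction problem state check needed. */
static int toy_assist_ssm(struct toy_vm *vm, uint8_t mask)
{
    TOY_SASSIST_PROLOG(vm->cr6);
    vm->vsysmask = mask;
    return 0;            /* assisted */
}
```

With the virtual problem state bit set, both toy assists return 1 and leave the guest state untouched. With the bit clear, they update the state and return 0. Any assist added later picks up the check simply by using the prolog.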

A .zip file is attached that contains this fix to ecpsvm.c as well as the other recent updates to ECPS:VM for issues #186, #187, and #188.

ecpsfix.zip https://github.com/hercules-390/hyperion/files/727485/ecpsfix.zip

ivan-w commented 7 years ago

Give me some time to think it over!

--Ivan

On 1/24/2017 6:53 PM, wably wrote: [quoted copy of the original post trimmed]

ivan-w commented 7 years ago

Basically, the ECPS:VM SVC assist shouldn't assist an SVC which changes anything in the virtual PSW except for the IA (because it's basically a branch).

--Ivan

wably commented 7 years ago

The SVC assist does perform the checks for system mask changes that would end the assist. However, the old PSW had DAT on as well, so there was no transition here. Going from enabled to disabled is OK (going from disabled to enabled is checked). Here are the PSWs:

    HHC90000D DBG: SASSIST SVC NEW VIRT 040C000000041DA2
    HHC90000D DBG: SASSIST SVC OLD VIRT 070D0000000810B0

Note that a PSW beginning with 040C should never be seen as a new PSW in CP!
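For anyone who wants to pick those two PSWs apart, here is a small decoding sketch in C. It assumes the System/370 EC-mode PSW layout (bit 0 is the leftmost bit; bit 5 is DAT, bits 6 and 7 are the I/O and external masks, bit 15 is problem state, bits 40-63 are the instruction address). The macro and function names are mine and are not taken from Hercules.

```c
#include <stdbool.h>
#include <stdint.h>

/* EC-mode PSW fields in a 64-bit value; PSW bit 0 is the most
   significant bit. Names are illustrative only. */
#define PSW_DAT      (1ull << (63 - 5))    /* bit 5:  DAT on            */
#define PSW_IO       (1ull << (63 - 6))    /* bit 6:  I/O mask          */
#define PSW_EXT      (1ull << (63 - 7))    /* bit 7:  external mask     */
#define PSW_ECMODE   (1ull << (63 - 12))   /* bit 12: EC mode           */
#define PSW_PROB     (1ull << (63 - 15))   /* bit 15: problem state     */
#define PSW_IA_MASK  0x00FFFFFFull         /* bits 40-63: instr address */

static bool psw_dat_on(uint64_t psw)        { return (psw & PSW_DAT)    != 0; }
static bool psw_io_enabled(uint64_t psw)    { return (psw & PSW_IO)     != 0; }
static bool psw_ext_enabled(uint64_t psw)   { return (psw & PSW_EXT)    != 0; }
static bool psw_ec_mode(uint64_t psw)       { return (psw & PSW_ECMODE) != 0; }
static bool psw_problem_state(uint64_t psw) { return (psw & PSW_PROB)   != 0; }

/* 24-bit instruction address held in the low-order bits of the PSW. */
static uint32_t psw_ia(uint64_t psw)
{
    return (uint32_t)(psw & PSW_IA_MASK);
}
```

Applied to the trace, the old PSW 070D0000000810B0 decodes as DAT on, I/O and external interrupts enabled, problem state, instruction address 0810B0. The new PSW 040C000000041DA2 decodes as DAT on, I/O and external interrupts disabled, supervisor state, instruction address 041DA2. That matches the enabled-to-disabled transition with DAT unchanged described above.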

wably commented 7 years ago

closing; fixed by commit of 3/4/2017