hercules-390 / hyperion

Cannot use more than one processor in VM/SP #261

Closed ivan-w closed 6 years ago

ivan-w commented 6 years ago

While trying to generate the S/370 hypervisor in MP or AP mode, I found that it fails to IPL if more than one processor is present in the configuration. Apparently, I/O interrupts are being presented to the wrong processor(s) and/or are not being raised on the correct processor (the one connected to the Channel Set to which the I/O devices are attached).

"ipending" shows as follows :

HHC01603I ipending
HHC00850I Processor CP00: CPUint=40000001 (State:40000001)&(Mask:C000000B)
HHC00851I Processor CP00: interrupt not pending
HHC00852I Processor CP00: I/O interrupt pending
HHC00853I Processor CP00: clock comparator not pending
HHC00854I Processor CP00: CPU timer not pending
HHC00855I Processor CP00: interval timer not pending
HHC00856I Processor CP00: ECPS vtimer not pending
HHC00857I Processor CP00: external call not pending
HHC00858I Processor CP00: emergency signal not pending
HHC00859I Processor CP00: machine check interrupt not pending
HHC00860I Processor CP00: service signal not pending
HHC00861I Processor CP00: mainlock held: no
HHC00862I Processor CP00: intlock held: no
HHC00863I Processor CP00: waiting for intlock: no
HHC00864I Processor CP00: lock not held
HHC00865I Processor CP00: connected to channelset 0000
HHC00866I Processor CP00: state STARTED
HHC00867I Processor CP00: instcount 308814
HHC00868I Processor CP00: siocount 225
HHC00869I Processor CP00: psw 020A000000000004
HHC00850I Processor CP01: CPUint=00000001 (State:40000001)&(Mask:8200000B)
HHC00851I Processor CP01: interrupt not pending
HHC00852I Processor CP01: I/O interrupt pending
HHC00853I Processor CP01: clock comparator not pending
HHC00854I Processor CP01: CPU timer not pending
HHC00855I Processor CP01: interval timer not pending
HHC00856I Processor CP01: ECPS vtimer not pending
HHC00857I Processor CP01: external call not pending
HHC00858I Processor CP01: emergency signal not pending
HHC00859I Processor CP01: machine check interrupt not pending
HHC00860I Processor CP01: service signal not pending
HHC00861I Processor CP01: mainlock held: no
HHC00862I Processor CP01: intlock held: no
HHC00863I Processor CP01: waiting for intlock: no
HHC00864I Processor CP01: lock not held
HHC00865I Processor CP01: connected to channelset 0001
HHC00866I Processor CP01: state STARTED
HHC00867I Processor CP01: instcount 147643
HHC00868I Processor CP01: siocount 0
HHC00869I Processor CP01: psw 010E000000000000
HHC00815I Processors CP02 through CP07 are offline
HHC00870I config mask 0000000000000001 started mask 0000000000000003 waiting mask 0000000000000000
HHC00871I syncbc mask 00007F836B7AC3D0 (null)
HHC00872I signaling facility not busy
HHC00873I TOD lock not held
HHC00874I mainlock not held; owner ffff
HHC00875I intlock not held; owner ffff
HHC00876I ioq lock not held
HHC00883I Channel Report queue: (NULL)
HHC00880I device 0:0010: status I/O pending
HHC00880I device 0:02D0: status I/O pending
HHC00881I I/O interrupt queue:
HHC00882I device 0:02D0:  normal,  pri ISC 00 CSS 00 CU 00
HHC00882I device 0:0010:  normal,  pri ISC 00 CSS 00 CU 00

CP01 should NOT have any I/O pending (there is no device on channel set 0001), and CP00 is obviously not getting the interrupt either (although its mask and state indicate that the interrupt should occur).
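As a quick check of the numbers in the listing, the CPUint value on each HHC00850I line is the bitwise AND of the State and Mask values printed beside it. The sketch below only reproduces that arithmetic from the values above; the helper name `cpuint` is made up for illustration, and nothing here assumes what the individual bits mean inside Hercules.

```python
# Values copied from the HHC00850I lines above: (State, Mask, reported CPUint).
cpus = {
    "CP00": (0x40000001, 0xC000000B, 0x40000001),
    "CP01": (0x40000001, 0x8200000B, 0x00000001),
}


def cpuint(state, mask):
    """Return the bits that are set in state and also enabled in mask."""
    return state & mask


for name, (state, mask, reported) in cpus.items():
    value = cpuint(state, mask)
    print(f"{name}: {state:08X} & {mask:08X} = {value:08X} (reported {reported:08X})")
```

Both processors report the same State, but bit 0x40000000 survives only CP00's mask, so the two CPUint values differ in exactly that bit.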

Also the "config mask" and "waiting mask" look suspicious !
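To make the three masks on the HHC00870I line easier to read, here is a minimal sketch that expands each one into a list of CPU numbers. It assumes that bit n of a mask (counting from the least significant bit) stands for processor n; that mapping is an assumption on my part and is not stated in the output above. The helper name `cpus_in_mask` is hypothetical.

```python
def cpus_in_mask(mask, max_cpus=64):
    """List CPU numbers whose bit is set.

    ASSUMPTION: bit n, counted from the least significant bit, is CPU n.
    """
    return [n for n in range(max_cpus) if (mask >> n) & 1]


# Mask values copied from the HHC00870I line above.
masks = {
    "config": 0x0000000000000001,
    "started": 0x0000000000000003,
    "waiting": 0x0000000000000000,
}

for name, mask in masks.items():
    print(f"{name:8s} {mask:016X} -> CPUs {cpus_in_mask(mask)}")
```

Under that assumption the config mask lists only CPU 0 while the started mask lists CPUs 0 and 1, and the waiting mask is empty, which may be what looked suspicious.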

jphartmann commented 6 years ago

There is a lot that is weird here. Look at the PSW for CP0. It is in a wait state with external interrupts disabled, which is highly unusual. The I/O mask is enabled. If message 851 indicates the summary status, that is strange too. The interrupts are queued for channel set 0, right? PSW address 4 is also unusual.
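The reading of the CP0 PSW above can be reproduced with a small decoder. This is a minimal sketch that assumes the PSW is in the S/370 EC-mode layout (bit 12 set, bits numbered from 0 at the most significant end); a BC-mode PSW is laid out differently and would not decode correctly. The function name `decode_ec_psw` is made up for illustration, and only a few fields are extracted.

```python
def decode_ec_psw(psw_hex):
    """Decode selected fields of a 64-bit S/370 EC-mode PSW given as hex."""
    psw = int(psw_hex, 16)

    def bit(n):
        # PSW bits are numbered from 0 at the most significant end.
        return (psw >> (63 - n)) & 1

    return {
        "io_enabled": bool(bit(6)),
        "external_enabled": bool(bit(7)),
        "key": (psw >> 52) & 0xF,          # bits 8-11
        "ec_mode": bool(bit(12)),          # ASSUMED to be 1 for this layout
        "machine_check_enabled": bool(bit(13)),
        "wait": bool(bit(14)),
        "problem_state": bool(bit(15)),
        "address": psw & 0xFFFFFF,         # bits 40-63
    }


# PSW value copied from the HHC00869I line for CP00.
for field, value in decode_ec_psw("020A000000000004").items():
    print(f"{field}: {value}")
```

For the CP00 value this reports the wait bit set, the I/O mask enabled, the external mask disabled, and an instruction address of 4.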

Why is the PSW not shown for CP1?

ivan-w commented 6 years ago

Yes.. Channel Set 0 (that's what the leading 0 stands for when in S/370 mode)... But I agree, a bunch of stuff looks very weird in the "ipending" output !

ivan-w commented 6 years ago

Ok.. Update on that... I managed to get it to work eventually.. I think I might have messed something up during CP nucleus generation at some point... I still have a problem (but it's transient) : I get a bunch of :

TURN ON THE NON-IPL CPU'S INTERVAL TIMER

I'll look why I'm getting that ! Anyway, I'll close this issue (User Error)

jphartmann commented 6 years ago

During IPL it would loop issuing "turn on the interval timer" if e.g., you IPL in a virtual machine where you have SET TIMER OFF. Presumably your AP is not getting enough "juice".

ivan-w commented 6 years ago

Actually I'm in MP mode... But the CPU may be a bit too fast - Making CP think that the interval timer isn't moving ! Eventually the system IPL continues - and everything seems to then proceed normally. So right now, I'm thinking it might be an artifact of CP not being designed to run a CPU that is about 50 times as fast as a 4381 P03 !

I'm still encountering some times when the IPL gets stuck... (and the weird ipending) - but it's not systematic... But once the IPL completes, everything seems to work normally (but there is no load on the guest).

ivan-w commented 6 years ago

Update: Actually, the behavior seems to be fairly dependent on the system type.. Setting it to "3090" seems to alleviate the "Turn on the Interval timer" thing.... It still doesn't fix the occasional failure to IPL... But it may be something else entirely !