Little issue with I/O subsystem

ghost commented 10 years ago

(from an email Ivan sent on Thu 1/30/2014)

Trying to reinstall VM/SP5 from the starter system, I'm running into a bit of an issue.

IPL of ICKDSF works fine.. Then I IPL the next file (which is DMKFMT)...

DMKFMT uses the 3CARD Loader (DMKBSL) which is a bit of a dumb program (but it resides in 3 cards - including the IPL card).. So its processing is very simplistic :

(Remember, the whole programs holds in 160 bytes)...

TIO the IPL Device
IF CC=3 LPSW disabled wait state
IF CC=2 retry
IF CC=1 Process the record (TXT record move the code and back to top, END card, jump to loaded program)
IF CC=0 SIO to read the next card/tape record
Back to the top

So we get a nice TIO loop... Fine...

Only with current hercules/hyperion I'm winding in situation in which after issuing the SIO, TIO will eventually give a CC=0, and when SIO is done again, it is accepted... Leading to various griefs:

Mangled tape buffers (we have now multiple threads trying to read the same tape)...
Deadlocks leading to machine checks (by the CPU watchdog) - and then unable to even quit herc..

Note about the config :

S/370 architecture
1 CP
16 MB

My host has multiple CPUs/Cores...

My belief is that there is a subtle race condition somewhere... (possibly in channel.c/execute_ccw_chain or schedule_ioq)

ghost commented 10 years ago

(From an email Ivan sent on Fri 1/31/2014)

I think I found the problem....

The issue lies with the interactions between dev->busy and dev->ioactive...

dev->ioactive is being used for 2 things

It indicates that an I/O is in progress... But for shared devices, it can also indicate that an I/O exists and has been initiated by another hercules instance.

So some of the code (testio, startio in channel.c for example) checks to see if dev->busy is set AND if the I/O is a local I/O... and will allow an I/O to go forward if dev->busy is set but dev->ioactive is not...

Unfortunately, dev->busy can also mean that execute_ccw_chain isn't done yet... So we're facing this:

CPU initiates I/O - Preliminary CCW checking is done - CPU is released and I/O scheduled (through the dev->s370start / SCSW2_ACSUSPEND trick).
CPU is released and goes into a TIO loop
I/O is performed in I/O thread - and execute_ccw_chain presents the status WHICH CLEARS dev->ioactive (DEV_SYS_NONE)
CPU Performs TIO and gets CC=1 Because dev->busy==1 and dev->ioactive==0 and status is pending
CPU Performs TIO again with CC=0
CPU Performs SIO, sets dev->busy to 1 and goes to execute_ccw_chain for the next ccw chain validation.. But in the course, dev->busy gets set to 0 by the previous I/O thread which wasn't done yet because execute_ccw_chain WILL release the device lock... So the I/O is scheduled in the device with dev->busy set to 0...
CPU Performs TIO with CC=0 (because dev->busy is 0)
CPU Performs SIO - requeues the same dev block to the I/O queue.... And all hell breaks lose... IO Queue pointers are barfed... multiple I/O threads are acting on the same device backend... deadlocks ensue, etc...

This may be a bit of a mess to sort out..

ghost commented 10 years ago

(from an email Ivan sent on Fri 1/31/2014)

This portion of code might be the culprit : (in channel.c at the end of execute_ccw_chain) :

/* Present the interrupt and return */
release_lock (&dev->lock); /* 1 */
queue_io_interrupt_and_update_status(dev); /* 2 */
execute_ccw_chain_return(NULL); /* 3 */

1: We release the lock.. Ok.. 2: Inside this, we get the lock back, and set dev->ioactive to DEV_SYS_NONE.. and release the lock again 3: Inside this (it's a macro) we get the lock back again, set dev->busy to 0, release the lock and return to caller...

Between 2 and 3 there is a window of opportunity to initiate another I/O and mess up dev->busy!

IOT: *2 gets lock, sets ioactive to DEV_SYS_NONE, releases lock..
CPU: does SIO, gets lock, sets busy to 1, releases lock..
IOT: *3 gets lock, sets busy to 0 (ooops!), releases lock and returns.

ghost commented 10 years ago

(from an email Ivan sent on Wed 2/5/2014)

In S/370 mode, the I/O is created in a suspended state (with the dev->s370devstart flag set) in order to account for the synchronous portion of S/370 SIO (which requires initial CCW validation before completing the SIO instruction). Once CCW validation is complete, the I/O is queued to be processed by a device thread (for non SYNCIO situations) with the CCW released and control is given back to the CPU instruction interpreter.

But that's not really the issue !

I find the path that leads to the situation described pretty straight forward because dev->ioactive not being set to DEV_SYS_LOCAL leads to dev->busy being ignored by channel.c[testio/startio] and because dev->ioactive and dev->busy are manipulated independently with the dev->lock being released between the two being manipulated (at the end of execute_ccw_chain).

The test case is simple : (I/O Interrupts need to be disabled)

DEVICE EQU <CUU of your test device>
   BALR 12,0
   USING *,12
   LA 1,CCWCHAIN
   ST 1,CAW
TIOLOOP DS 0H
   TIO DEVICE
   BNZ  TIOLOOP
   SIO DEVICE
   B   TIOLOOP

CCWCHAIN is any valid CCW chain that is neither a single NO-OP, nor a SENSE nor a SENSEID (Since those can be treated synchronously with SYNCIO).. For example, REWIND on a AWS tape will do just fine... The amount of time it takes to service the I/O is not really important.. What's important is that it be serviced by a Device Thread.

You will see at some point (with CCW trace turned on):

SIO Giving CC=0
TIO Giving CC=0 right away (this should never happen when the CPU is disabled for I/O interrupts.. The only possible answers are either CC=2 (busy), CC=1 (CSW stored) or CC=3 (Device went offline))
SIO Giving CC=0 We're in bad shape now... 2 I/Os are overlapping...

Hilarity ensues at this point... the device threads gets very confused... 'quit' won't respond and your only option is usually to kill the process altogether (probably some deadlock between iointlock, ioqlock and dev->lock)... But these are only a symptom of something that shouldn't be happening in the first place.

PS: I/O thread priority will have an impact on how often the error shows up, or if it shows at all.. It may never show on a specific rig if the I/O thread has a higher priority than the CPU thread.. or if the Operating System thread dispatching mechanism prevents the problem from showing (especially regarding how lock contentions are handled). But the error exists nonetheless...

ghost commented 10 years ago

Additional info by Fish:

Since the issue seems to surround dev->ioactive (DEV_SYS_LOCAL, DEV_SYS_NONE, etc) which is only used by Greg's Shared Device Server logic, I am going to be adding code to support building Hercules with or without OPTION_SHARED_DEVICES #defined.

Once that code is in place (new HQA Build Scenarios 12 and 13) it might be possible to simplify the problematic channel code for the case when OPTION_SHARED_DEVICES is not #defined such that the problem no longer exists (i.e. we might be able to completely close the bug's window of opportunity for #undef OPTION_SHARED_DEVICES builds).

This will allow us to temporarily get past (woirkaround) this bug thereby letting us concetrate at our liesure on making whatever changes are necessary to make Greg's Shared Dasd support work correctly with Mark's new channel code.

Fish-Git commented 10 years ago

I've had a chance to work on this issue and it's being caused by Greg's "Shared Device Server" feature (OPTION_SHARED_DEVICES).

Many modifications (fixes) were made to Hyperion's channel subsystem logic a while ago to provide a truer more accurate emulation of System 370, ESA and Z subsystem architecture, and the current Shared Device Server implementation does not conform to the new (more accurate) architecture.

Therefore, until we (i.e. "someone", perhaps Greg himself with coordinated help from Mark) can manage to make the changes necessary to allow the Shared Device Server code to work correctly with our more accurate (compliant) channel subsystem architecture it needs to be disabled.

I will very shortly be making a series of commits that does this which should (hopefully) properly resolve your issue.

Thanks.

"Fish" (David B. Trout) Software Development Laboratories http://www.softdevlabs.com (TEMP): david.b.trout@gmail.com

Fish-Git commented 10 years ago

Commit 6cb172649de531bc02a9e6024e43cd7266aedb76 fixes #27.

hercules-390 / hyperion

Little issue with I/O subsystem #27