open-simh / simh

The Open SIMH simulators package
https://opensimh.org/

PDP11: RP11: Interrupt on IE+RESET+GO #389

Closed al20878 closed 3 months ago

al20878 commented 3 months ago

Recent analysis of the 2.9BSD kernel revealed that the RP11 was expected to interrupt on the control RESET function if the IE bit was also set. The documentation is not very clear on this point: in one place it says that RESET+GO does not interrupt (which does not contradict the above, because that passage does not mention IE).

In another place, however, it says that IE always causes an interrupt when DONE is asserted. Thus, since RESET does assert DONE, an interrupt should be posted if IE is set. The autoconfig binary from 2.9BSD uses this feature of the RP11 to check for the presence of the controller.

Formerly, RESET unconditionally set RPCS to just DONE, which cleared IE as well. This patch makes sure that the IE bit is preserved and, if it is set, posts an interrupt when RESET asserts DONE.
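The before/after behavior described above can be sketched as a small stand-alone model. This is an illustration only, not the code in pdp11_rr.c: the `rp_model` structure, the `rp_reset_go` helper, and the `patched` flag are hypothetical, and the RPCS bit positions (GO in bit 0, function field in bits 1-3 with 0 meaning RESET, IE in bit 6, DONE in bit 7) are assumptions that should be checked against the RP11 manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed RPCS bit layout (octal); verify against the RP11 manual. */
#define CS_GO    0000001u   /* start the selected function */
#define CS_FUNC  0000016u   /* function field; 0 is taken to mean RESET */
#define CS_IE    0000100u   /* interrupt enable */
#define CS_DONE  0000200u   /* controller done */

/* Hypothetical minimal controller state, not SIMH's actual structures. */
typedef struct {
    uint16_t rpcs;      /* control/status register */
    bool     int_req;   /* interrupt request pending */
} rp_model;

/* Handle a CSR write if it is RESET+GO; returns false for anything else. */
static bool rp_reset_go(rp_model *rp, uint16_t data, bool patched)
{
    if (!(data & CS_GO) || (data & CS_FUNC) != 0)
        return false;                          /* not RESET+GO; ignored here */

    if (patched) {
        rp->rpcs = CS_DONE | (data & CS_IE);   /* keep IE as written */
        rp->int_req = (rp->rpcs & CS_IE) != 0; /* DONE with IE set interrupts */
    } else {
        rp->rpcs = CS_DONE;                    /* old behavior: IE is lost */
        rp->int_req = false;                   /* so no interrupt is posted */
    }
    return true;
}
```

In this model, writing `CS_IE | CS_GO` with `patched` set leaves `int_req` true, which is what a probe like the one in autoconfig waits for; writing plain `CS_GO` clears both IE and any pending request in either mode.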

al20878 commented 3 months ago

Before the patch:

Berkeley UNIX (Rev. 2.9.1) Sun Nov 20 14:55:50 PST 1983
mem = 144064
xp0: drive type 0 unrecognized
xp1: drive type 0 unrecognized
xp2: drive type 0 unrecognized
xp3: drive type 0 unrecognized

CONFIGURE SYSTEM:
xp ? csr 176700 vector 254 interrupt vector already in use
rk 0 csr 177400 vector 220 attached
hk 0 csr 177440 vector 210 attached
rl 0 csr 174400 vector 160 attached
rp ? csr 176700 vector 254 didn't interrupt
ht 0 csr 172440 vector 224 skipped:  No CSR
tm 0 csr 172520 vector 224 attached
ts 0 csr 172520 vector 224 interrupt vector already in use
dh ? csr 160020 vector 370 skipped:  No CSR
dm ? csr 170500 vector 360 skipped:  No autoconfig routines
dz ? csr 160110 vector 320 interrupt vector wrong
dz ? csr 160110 vector 320 interrupt vector wrong
dn 0 csr 175200 vector 300 skipped:  No autoconfig routines
vp ? csr 177500 vector 174 skipped:  No autoconfig routines
lp 0 csr 177514 vector 200 attached
Erase=^?, kill=^U, intr=^C
#

After the patch:

sim> b rr
>boot

70Boot
: rp(0,0)rpunix

Berkeley UNIX (Rev. 2.9.1) Sun Nov 20 14:55:50 PST 1983
mem = 144064
xp0: drive type 0 unrecognized
xp1: drive type 0 unrecognized
xp2: drive type 0 unrecognized
xp3: drive type 0 unrecognized

CONFIGURE SYSTEM:
xp ? csr 176700 vector 254 interrupt vector already in use
rk 0 csr 177400 vector 220 attached
hk 0 csr 177440 vector 210 attached
rl 0 csr 174400 vector 160 attached
rp 0 csr 176710 vector 254 attached
ht 0 csr 172440 vector 224 skipped:  No CSR
tm 0 csr 172520 vector 224 attached
ts 0 csr 172520 vector 224 interrupt vector already in use
dh ? csr 160020 vector 370 skipped:  No CSR
dm ? csr 170500 vector 360 skipped:  No autoconfig routines
dz ? csr 160110 vector 320 interrupt vector wrong
dz ? csr 160110 vector 320 interrupt vector wrong
dn 0 csr 175200 vector 300 skipped:  No autoconfig routines
vp ? csr 177500 vector 174 skipped:  No autoconfig routines
lp 0 csr 177514 vector 200 attached
Erase=^?, kill=^U, intr=^C

Note the change in the rp line.

Surprisingly, autoconfig's configuration file, /etc/dtab, was in error: it specified the RP11's CSR as 176700 (first report). Replacing that with the correct value, 176710, did not remove the error by itself; only with the fix in pdp11_rr.c did it clear (second report).

It's interesting to note that since autoconfig is a standalone binary (and not part of the kernel), the rp "misbehavior" did not affect the kernel or access to the data on disk, because the kernel itself does not use the IE+RESET+GO->Interrupt feature. Usually, when code needs to reset the controller, it just writes RESET+GO to the CSR, leaving the IE bit zero. Also, internally the kernel was using the correct CSR address, 176710, which is why the kernel booted and the system operated correctly.

pkoning2 commented 3 months ago

A lot of devices clear IE as part of the reset function. I take it that's not the case for the RP11? It would be nice to confirm this using the controller schematics.

al20878 commented 3 months ago

> It would be nice to confirm this using the controller schematics.

TBH, I am not very good at reading those, especially when the CSR drawings are (ironically) the worst-quality scans of all the pages in there:

http://www.bitsavers.org/pdf/dec/unibus/RP11_schem_Sep74.pdf
http://www.bitsavers.org/pdf/dec/unibus/RP11-C_schemMay1973.pdf

But I can read source code quite well, and what I saw 2.9BSD doing confirms my finding: if software wrote IE+RESET+GO into the CSR, the controller was expected to interrupt.

Normally, software writes plain RESET+GO, which resets IE (because IE is 0 in such a command) and, as a result, clears the interrupt if one was pending.

Besides, this is the exact quote from the RP11-C documentation http://www.bitsavers.org/pdf/dec/unibus/DEC-11-HRPCA-C-D_RP11-C_Maint_Aug74.pdf

4.8
4.8.1 Done Interrupt
To enable the Done Interrupt logic, the Interrupt on Done Enable (IDE) bit in the RPCS must be set under
program control. Whenever any operation is successfully completed, or if a condition that invalidates that
operation is detected, the program wants to be interrupted. All the above conditions are ORed to produce
STR NOT READY H.  STR NOT READY H clears the NOT READY flip-flop (drawing D25). NOT READY (0) H
and IDE (1) H are applied to the M7821 Interrupt Control module to assert BUS BR 5 L.

(That's at the top of page 4-22; I had to retype it manually because it's a scanned document whose text can't be copied and pasted.)

Anyway, the operative word in the above is "whenever", together with the fact that IE is set by the program. IE+RESET+GO are set by the program, and should cause an interrupt. Conversely, the far more widely used RESET+GO clears the interrupt (and the IE bit, obviously).

IE and RESET(+GO) share the same byte, so you can't start a function without updating IE at the same time (minimally with a byte instruction). OTOH, you can update IE without starting any function, e.g. IE+RESET [no GO] -- with the RP11-C one has to be careful, though, not to change the MX bits (memory extension A<16:17>, also tucked in there) while an I/O is in progress, because that would corrupt the memory addresses used by that operation.
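As a rough illustration of that point, here is one way a driver might compose the low CSR byte so that IE, the function field, and GO are written together while the memory extension bits are carried over unchanged. The helper name `rp_cs_low_byte` is hypothetical, and the bit positions (GO in bit 0, function in bits 1-3, memory extension in bits 4-5, IE in bit 6) are assumptions to be checked against the RP11-C manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the low RPCS byte (octal); verify against the manual. */
#define CS_GO    0001u   /* start the selected function */
#define CS_FUNC  0016u   /* function field; 0 is taken to mean RESET */
#define CS_MEX   0060u   /* memory extension bits (RP11-C) */
#define CS_IE    0100u   /* interrupt enable */

/*
 * Build a new low byte from the current one: the memory extension bits
 * are preserved, everything else is taken from the arguments.
 */
static uint8_t rp_cs_low_byte(uint8_t current, bool ie, unsigned func, bool go)
{
    unsigned b = current & CS_MEX;     /* never disturb MEX during I/O */

    if (ie)
        b |= CS_IE;
    b |= (func << 1) & CS_FUNC;        /* place function code in bits 1-3 */
    if (go)
        b |= CS_GO;
    return (uint8_t)b;
}
```

For example, `rp_cs_low_byte(cur, true, 0, false)` corresponds to IE+RESET without GO: it changes IE only and starts no function, while `rp_cs_low_byte(cur, true, 0, true)` is the IE+RESET+GO combination discussed in this thread.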

BTW, I re-ran all the XXDP, X11, DOS-11, and RSX-11 tests that I had at my disposal (in addition to running 2.9BSD now, and doing all sorts of wild things like setting up file systems, etc., without a single hitch), and I can confirm that nothing appears to be broken.

al20878 commented 3 months ago

Are there still any concerns? Can this please be pushed, if not?

al20878 commented 3 months ago

Thanks, @pkoning2 !

> A lot of devices clear IE as part of the reset function.

Looking at the 2.9BSD autoconfig code, I'd say that quite a few devices were expected to post an interrupt when IE+RESET+GO was written to the CSR:

Massbus RP04/5/6/7 and RM02/03/05 disks; RK disks; TMSCP tapes...

So it wasn't unique to the RP11 controller, actually.

P.S. Bus reset clears IE, though, like you mentioned.