PDP-10 / its

Incompatible Timesharing System

PAPSAV #1029

Closed larsbrinkhoff closed 5 years ago

larsbrinkhoff commented 6 years ago

CHANNA; RAKASH PAPSAV

Originally DM demon as SYS; ATSIGN PAPSAV?

larsbrinkhoff commented 6 years ago

Supposedly spies on T00, i.e. the system console.

This would be very good to have for checking the system log. SYSMSG doesn't display everything.

larsbrinkhoff commented 6 years ago

Binary code is 373 words. Should be able to reconstruct.

eswenson1 commented 5 years ago

This program appears to share some code with SYSEN1; SYSMSG >. There are some strings and symbols in common, such as the string "THERE IS A HOLE IN MEMORY" and the starting address of sysmsg.

eswenson1 commented 5 years ago

I've reconstructed the source for this. It will be in SYSEN3;PAPSAV 1. Not sure what version number to use.

eswenson1 commented 5 years ago

@larsbrinkhoff @atsampson I found something really interesting -- to me, at least. I have a version of PAPSAV that I created that uses SYSENG; CALRET >. I haven't yet gotten it to match byte-for-byte with ATSIGN PAPSAV (and the hand-coded version of PAPSAV that I've already merged). I'm working on that, so more on that later.

However, as long as I was using all the cool macros in CALRET, I thought I'd use the cool macros in SYSMSG that do the address patching. As background, both SYSMSG and PAPSAV contain a table of ITS symbols and patch locations (instructions to patch with the ITS address that corresponds to the symbol). On startup, both programs grab the ITS version and the address of a user table (by evaluating SQUOZE symbols in ITS) and record them. Each program then fetches a bunch of addresses from ITS (e.g. the system console TTY buffer pointer) and patches its code with them. For example, it patches movei a,0 so that the address field holds the ITS address of the corresponding symbol.
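The patching step described above can be sketched in Python. This is only an illustration of the idea, not the PAPSAV code: the function names, the memory model (a dict of 36-bit words), and the symbol value 0o123456 are all made up for the example. The one fact it relies on is that a PDP-10 instruction keeps its address field in the low 18 bits of the word.

```python
# Illustrative sketch only; names and values here are hypothetical.
RIGHT_HALF = 0o777777        # low 18 bits of a 36-bit word: the address field
WORD_MASK = (1 << 36) - 1    # a PDP-10 word is 36 bits wide


def patch_address(word: int, value: int) -> int:
    """Return `word` with its 18-bit address field replaced by `value`."""
    return ((word & ~RIGHT_HALF) | (value & RIGHT_HALF)) & WORD_MASK


def apply_patches(memory, patch_table, lookup):
    """Patch every (symbol, location) pair in `patch_table`.

    `lookup` stands in for evaluating a symbol in the running ITS;
    here it is just a callable from symbol name to value.
    """
    for symbol, location in patch_table:
        memory[location] = patch_address(memory[location], lookup(symbol))


# MOVEI is opcode 201; with AC 1 and a zero address the word is 201040,,0.
memory = {0o100: 0o201040000000}
its_symbols = {"SYSMBF": 0o123456}   # made-up value for illustration
apply_patches(memory, [("SYSMBF", 0o100)], its_symbols.get)
```

Only the address field changes; the opcode and accumulator bits in the left half are preserved, which is why the instruction can be assembled with a placeholder operand and fixed up at startup.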

Now, my reconstructed code does this patching with code I reverse engineered from the existing ATSIGN PAPSAV binary -- and assembling my code results in an exact match of the binary. The SYSMSG code, on the other hand, uses some nice macros to create the patch table. Rather than having to hand-code the symbol and the address to patch, the SYSMSG code just creates (using a macro) the table of symbols as well as relocation data that allows a more automatic patching of the code. The ITS symbols are referenced by the instructions to be patched, and the value they have (from the macro) allows the patching code to know which ITS symbol address to patch there. It is pretty clever, actually.

So my inclination was to use both CALRET and these macros from SYSMSG to implement the virtually identical functionality in PAPSAV. The issue is that these SYSMSG macros generate different code -- specifically, they generate literals in the constants section of the binary that aren't present in ATSIGN PAPSAV (original or assembled with my hand-coded version). These literals throw off all the other literals, and result in a non-matching binary.

So my conclusion is this: The original PAPSAV used CALRET macros, but did NOT use the SYSMSG macros for the patch table (also the patching code is slightly different). It was done manually. Perhaps someone wrote SYSMSG after PAPSAV and decided to be more clever. I'm going to abandon using those SYSMSG macros and hand-code the patch table as I did in my original reconstructed code.

To demonstrate the difference, consider the following:

Disassembling ATSIGN PAPSAV at the address NOGOOD shows this:

nogood/   MOVEI T1,NOGOOD+46
NOGOOD+1/   PUSHJ P,TYPE
NOGOOD+2/   JRST KILL
NOGOOD+3/   -100,,MEMHOL+7
NOGOOD+4/   T1,,T1
NOGOOD+5/   EQVM F,(.PRSVZ)
NOGOOD+6/   IOR CH,644100(T1)
NOGOOD+7/   SOJ
NOGOOD+10/   EQVM B,475756(T1)
NOGOOD+11/   SETZ
NOGOOD+12/   HLREM T1,560000(B)
NOGOOD+13/   NOGOOD+4
NOGOOD+14/   NOGOOD+5
NOGOOD+15/   NOGOOD+6
NOGOOD+16/   NOGOOD+7
NOGOOD+17/   SETZ NOGOOD+10
NOGOOD+20/   FEEP
NOGOOD+21/   40
NOGOOD+22/   CRLF
NOGOOD+23/   SEP
NOGOOD+24/   COUNT
NOGOOD+25/   MOVEI .PRSVA,40
NOGOOD+26/   POPJ P,
NOGOOD+27/   CAI C,F

The constants section begins at address NOGOOD+3, with the first literal being -100,,MEMHOL+7. (MEMHOL+7 is actually the address of the stack, but the symbol was half-killed, I think.)

Note that from NOGOOD+11 to NOGOOD+17, we see the literals that make up the arguments to an OPEN call -- the first call made by PAPSAV. And note that immediately following this, we see at NOGOOD+20, the literal FEEP. This literal is used in the next instruction following the OPEN that uses literals. In other words, this makes perfect sense -- the literals used by the .CALL OPEN and the literals used in the next instruction with a literal reference are back-to-back.

Now, contrast this with the same from my CALRET-using and SYSMSG-macro-using version:

nogood/'RITUAL$:   MOVEI T1,NOGOOD+56
NOGOOD+1/   PUSHJ P,TYPE
NOGOOD+2/   JRST KILL
NOGOOD+3/   -100,,MEMHOL+7
NOGOOD+4/   T1,,T1
NOGOOD+5/   EQVM F,(.PRSVZ)
NOGOOD+6/   IOR CH,644100(T1)
NOGOOD+7/   SOJ
NOGOOD+10/   EQVM B,475756(T1)
NOGOOD+11/   SETZ
NOGOOD+12/   HLREM T1,560000(B)
NOGOOD+13/   NOGOOD+4
NOGOOD+14/   NOGOOD+5
NOGOOD+15/   NOGOOD+6
NOGOOD+16/   NOGOOD+7
NOGOOD+17/   SETZ NOGOOD+10
NOGOOD+20/   SYSMSG+6,,0
NOGOOD+21/   SYSMSG+7,,0
NOGOOD+22/   SETM TT,414141(T1)
NOGOOD+23/   FEEP
NOGOOD+24/   40
NOGOOD+25/   CRLF

Notice that at addresses NOGOOD+20 and NOGOOD+21, we see SYSMSG+6,,0 and SYSMSG+7,,0. Ignore the literal that follows (NOGOOD+22) -- that was a debugging literal I threw in there to prove what was going on. At NOGOOD+23 we find the FEEP literal.

The two values at NOGOOD+20 and NOGOOD+21 name two of the locations that are in the patch table as the addresses that must be patched:

*sysmsg+6/   LSH COUNT,
SYSMSG+7/   MOVEI PT,
SYSMSG+10/   JRST PAPSAV

The LSH COUNT, instruction is patched to LSH COUNT, SYSMLN. And the MOVEI PT, instruction is patched as MOVEI PT,SYSMBF.

The patch table, as coded in my hand-coded version, looks like this:

abstb1: <squoze 0,SYSMBF>       ; system message buffer
        sysmsg+7
        <squoze 0,TOIP>         ; tty output ptr
        syget0
        <squoze 0,TOBEP>        ; end of buffer
        syget0+2
        <squoze 0,TOOP>         ; output buffer output pointer
        papsav+1

immeds: <squoze 0,TOBL>         ; tty output buffer length
        syget0+3
        <squoze 0,SYSCON>       ; system tty number
        papsav
        <squoze 0,SYSMLN>       ; log 2 of number of 4-word blocks
        sysmsg+6
abstb2:
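For readers unfamiliar with SQUOZE, the symbol words in a table like this can be sketched in Python. The character codes (blank, digits, letters, then `.`, `$`, `%`, packed in base 40) are the standard PDP-10 ones. Whether a short name such as TOIP is padded on the left or the right is an assumption here, so it is exposed as a switch, and the flag bits are left at zero as in the table above. The function name is made up for the example.

```python
# Illustrative sketch; the justification of short symbols is an assumption.
SQUOZE_CHARS = " 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ.$%"  # codes 0 through 39


def squoze(symbol: str, left_justify: bool = True) -> int:
    """Pack up to six characters into a base-40 value (flag bits zero)."""
    if len(symbol) > 6:
        raise ValueError("SQUOZE holds at most six characters")
    name = symbol.upper()
    name = name.ljust(6) if left_justify else name.rjust(6)
    value = 0
    for ch in name:
        value = value * 40 + SQUOZE_CHARS.index(ch)
    return value


# Six-character names are unaffected by the justification choice.
sysmbf = squoze("SYSMBF")
```

Since 40**6 is just under 2**32, the packed name leaves four bits free at the top of a 36-bit word, which is where the flags argument (0 in every entry above) goes.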

And the table, coded using the SYSMSG macros, looks like

ABSREF [SYSMBF          ;SYSTEM MESSAGE BUFFER
        TOIP            ;TTY OUTPUT PTR
        TOBEP           ;.., END OF BUFFER
        TOOP            ; output buffer output pointer
----
        TOBL            ;TTY OUTPUT BUFFER LENGTH
        SYSCON          ;SYSTEM TTY NUMBER
        SYSMLNG         ;LOG 2 OF NUMBER OF 4-WORD BLOCKS
]

A lot cooler and more elegant, of course. But it uses a literal for each symbol in the table, which shifts the rest of the constants/literals and therefore changes their addresses in every instruction that references them, so the resulting binary does not match ATSIGN PAPSAV.
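The shift can be illustrated with a small Python model of a literal pool. It is a sketch, not a model of MIDAS itself: it simply assumes literals are laid out consecutively in order, and it uses the offsets visible in the two listings (13 shared words at NOGOOD+3 through NOGOOD+17, then FEEP, 40, CRLF). The names `layout`, `shared`, `extra` and `tail` are made up for the example.

```python
# Illustrative sketch: literals are assumed to be laid out in order.
POOL_START = 0o3  # offset of the first literal from NOGOOD in both listings


def layout(literals, start=POOL_START):
    """Assign consecutive offsets (relative to NOGOOD) to each literal."""
    return {name: start + i for i, name in enumerate(literals)}


shared = [f"lit{i}" for i in range(13)]          # NOGOOD+3 .. NOGOOD+17
tail = ["FEEP", "40", "CRLF"]                    # literals that follow
extra = ["SYSMSG+6,,0", "SYSMSG+7,,0", "debug"]  # added in the macro version

hand = layout(shared + tail)
macro = layout(shared + extra + tail)
shift = {name: macro[name] - hand[name] for name in tail}

for name in tail:
    print(f"{name}: NOGOOD+{hand[name]:o} -> NOGOOD+{macro[name]:o}")
```

Every literal after the inserted ones moves by the same amount, so every instruction that refers to one of them assembles to a different word. The first instruction in the two listings shows the same effect further down the pool: its operand moves from NOGOOD+46 to NOGOOD+56, a shift of eight words, which is consistent with seven table entries plus the one debugging literal.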

eswenson1 commented 5 years ago

By the way, I now have a version of PAPSAV that uses CALRET and whose assembly matches ATSIGN PAPSAV exactly. So I'll soon submit a PR to replace the one that doesn't use CALRET with this one, since CALRET was clearly used in the original PAPSAV sources. Should I make the version number PAPSAV 2 and include both, or should I replace PAPSAV 1 with the new version?

larsbrinkhoff commented 5 years ago

Please make a version 2 and have it replace version 1. That way, we can more easily see the diff between the two.

larsbrinkhoff commented 5 years ago

And next you could make version 3 using the ABSREF macro, replacing version 2. It's ok if it doesn't generate identical code as long as it works the same. Version 1 (or 2) can be used to assemble an identical binary if someone wanted to do that.

eswenson1 commented 5 years ago

Ok. Sounds like a plan. If I do that, should both 2 and 3 be made part of the release, or only 3?

larsbrinkhoff commented 5 years ago

Just 3.