Closed larsbrinkhoff closed 1 year ago
I hacked the build script to stop and loop around after SALV finishes successfully, or output a trace if it halts.
Apparently, we take the SOGJ loop after MREAD2, then JRST MREAD3, and the SHORTL skip fails there. But the same sequence also happens before a successful skip.
Checking uses of SHORTL, it's normally just checked by the skip instruction and not updated. Out of 5,000,000 instructions in the trace, SHORTL is only updated seven times. So occasionally it's set in MREAD6 and shortly after (461 instructions) cleared in MREAD2.
MREAD2 clears SHORTL if CONI MTS signals an EOFF condition. I assume that's end of file. When SALV halts, this doesn't happen.
Here's what normally happens.
220300 000000000064 222457 476000222457 777777777777 300004 476000222457 SETOM 0,222457
220301 000000000000 000007 734300000007 000000000000 300004 734300000007 CONSZ 340,7
220325 000000000000 000001 734600000001 000000000001 300004 734600000001 CONO 344,1
220326 000002222571 000100 734740000100 000000000100 300004 734740000100 CONSO 344,100
220330 000000000013 440000 734700440000 000000000000 300004 734700440000 CONSZ 344,440000
220332 000002222571 020600 734740020600 000000000000 300004 734740020600 CONSO 344,20600
220250 000000000000 560200 734200560200 000000560200 300004 734200560200 CONO 340,560200
220251 256011000010 000007 734340000007 000000000000 300004 734340000007 CONSO 340,7
220260 000000000000 562200 734200562200 000000562200 300004 734200562200 CONO 340,562200
220261 000002222571 014101 734740014101 000000000000 300004 734740014101 CONSO 344,14101
220261 000002222571 014101 734740014101 000000010100 300004 734740014101 CONSO 344,14101 <<<
220265 000000000002 000003 734640000003 000170011142 300004 734640000003 CONI 344,3 <<<
220276 000000000064 222457 402000222457 000000000000 300004 402000222457 SETZM 0,222457
220301 000000000000 000007 734300000007 000000000000 300004 734300000007 CONSZ 340,7
220325 000000000000 000001 734600000001 000000000001 300004 734600000001 CONO 344,1
220326 000002222571 000100 734740000100 000000000100 300004 734740000100 CONSO 344,100
220330 000000000000 440000 734700440000 000000000000 300004 734700440000 CONSZ 344,440000
220332 000002222571 020600 734740020600 000000000000 300004 734740020600 CONSO 344,20600
220250 000000000000 560200 734200560200 000000560200 300004 734200560200 CONO 340,560200
220251 256011000010 000007 734340000007 000000000000 300004 734340000007 CONSO 340,7
220260 000000000000 562200 734200562200 000000562200 300004 734200562200 CONO 340,562200
220261 000003222572 014101 734740014101 000000000000 300004 734740014101 CONSO 344,14101
220261 000003222572 014101 734740014101 000000000001 300004 734740014101 CONSO 344,14101
220265 000000000760 000003 734640000003 000175000001 300004 734640000003 CONI 344,3
220317 000000000007 000007 734300000007 000000000000 300004 734300000007 CONSZ 340,7
220322 000000000064 222457 000000000000 000000000000 300004 332000222457 SKIPE 0,222457
But this is what happens before the halt.
220300 000000000064 222457 476000222457 777777777777 300004 476000222457 SETOM 0,222457
220301 000000000000 000007 734300000007 000000000000 300004 734300000007 CONSZ 340,7
220325 000000000000 000001 734600000001 000000000001 300004 734600000001 CONO 344,1
220326 000002222571 000100 734740000100 000000000100 300004 734740000100 CONSO 344,100
220330 000000000000 440000 734700440000 000000000000 300004 734700440000 CONSZ 344,440000
220332 000002222571 020600 734740020600 000000000000 300004 734740020600 CONSO 344,20600
220250 000000000000 560200 734200560200 000000560200 300004 734200560200 CONO 340,560200
220251 256011000010 000007 734340000007 000000000000 300004 734340000007 CONSO 340,7
220260 000000000000 562200 734200562200 000000562200 300004 734200562200 CONO 340,562200
220261 000002222571 014101 734740014101 000000000000 300004 734740014101 CONSO 344,14101
220261 000002222571 014101 734740014101 000000000001 300004 734740014101 CONSO 344,14101
220265 000040372700 000003 734640000003 000175000001 300004 734640000003 CONI 344,3 <<<
220317 000000000000 000007 734300000007 000000000000 300004 734300000007 CONSZ 340,7 <<<
220322 000000000064 222457 777777777777 777777777777 300004 332000222457 SKIPE 0,222457
In the first run, CONI returns with JOBDON and EOFF set. This causes SHORTL to be cleared.
In the second run, CONI returns with DATREQ set. SALV doesn't like that.
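To make the difference between the two CONI results easier to see, here is a minimal sketch that decodes them. The bit values for DATREQ, JOBDON, and EOFF are assumptions inferred from the trace (the `CONSO 344,14101` mask and the masked results `010100` and `000001`); they are not taken from the SALV source.

```python
# Assumed status bit values, inferred from the trace above.
# They are illustrative and not copied from the SALV source.
DATREQ = 0o000001   # data request
JOBDON = 0o000100   # job done
EOFF = 0o010000     # end of file (tape mark seen)

FLAGS = {"DATREQ": DATREQ, "JOBDON": JOBDON, "EOFF": EOFF}

def decode(status):
    """Return the names of the assumed flags set in a CONI status word."""
    return [name for name, bit in FLAGS.items() if status & bit]

first_run = 0o000170011142    # CONI 344 result in the normal run
second_run = 0o000175000001   # CONI 344 result before the halt

print(decode(first_run))
print(decode(second_run))
```

Under those assumptions, the first word decodes to JOBDON and EOFF, and the second to DATREQ only, which matches the description above.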
@rcornwell suggested SALV may be working as intended, and that the tape is in error. That turned out to be the case. There's an EOF tape mark missing after a file on the tape. Supposedly that's why SALV halts, although I haven't confirmed this. But it does halt before processing the next file.
So the question now is: why is the tape malformed? It's written by DUMP under ITS. DUMP will happily list all files from the bad tape; apparently it doesn't care about the missing mark. itstar also doesn't detect any problem.
I ran the build script in a loop to the point where it writes the reboot.tape image and had it break when the tape is missing an EOF mark between files. I had mta debug turned on. I'm attaching the debug output and the tape file.
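For anyone who wants to inspect such a tape file, here is a rough sketch of a walker that lists records and tape marks in order. It assumes the image is in SIMH's standard tape container format (32-bit little-endian length, data padded to an even byte count, length repeated; a zero length is a tape mark); whether reboot.tape actually uses that format is an assumption. The names `walk_tape` and `record` are made up for this example. It does not know about DUMP's file headers, so deciding where one file ends and the next begins is still left to the reader.

```python
import io
import struct

TAPE_MARK = 0x00000000
END_OF_MEDIUM = 0xFFFFFFFF

def walk_tape(stream):
    """Yield ('mark', 0) or ('record', length) for each item in the image."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # physical end of the image file
        (length,) = struct.unpack("<I", header)
        if length == END_OF_MEDIUM:
            return
        if length == TAPE_MARK:
            yield ("mark", 0)
            continue
        size = length & 0x00FFFFFF  # low 24 bits hold the byte count
        stream.seek(size + (size & 1), io.SEEK_CUR)  # skip data plus padding
        stream.read(4)  # skip the trailing copy of the length
        yield ("record", size)

def record(data):
    """Encode one data record in the assumed container format."""
    pad = b"\x00" * (len(data) & 1)
    size = struct.pack("<I", len(data))
    return size + data + pad + size

# Tiny in-memory image: record, tape mark, record.
image = record(b"abc") + struct.pack("<I", TAPE_MARK) + record(b"defg")
items = list(walk_tape(io.BytesIO(image)))
print(items)
```

On a real image, opening the file with `open(path, "rb")` in place of the `io.BytesIO` object should work the same way.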
As you can see, the tape mark is missing between SYSTEM; CH11 DEFS1 and SYSTEM; CHAOS 290. Below are some lines I grepped out from the log file. I grepped for "Record Writ", "XXX", and "WTM". The XXX lines are my own annotations to show the first record from each of the two files. There is a WTM between the two, so there should be a mark. CH11 DEFS1 is a short file, so it's just one record. CHAOS 290 is a longer file, so the first record is 1024 words, or hex 1400 frames.
DBG(11340477055)> MTA STR: MTA0 Record Write len: 00000B09
XXX SYSTEM; CH11 DEFS1
DBG(11340477055)> MTA STR: MTA0 Record Written len: 00000B0A
XXX SYSTEM; CH11 DEFS1
DBG(11340479792)> MTA DETAIL: MT0 WTM
DBG(11340482155)> MTA STR: MTA0 Record Write len: 00000000
DBG(11342638172)> MTA STR: MTA0 Record Write len: 00001400
XXX SYSTEM; CHAOS 290
DBG(11342638172)> MTA STR: MTA0 Record Written len: 00001400
XXX SYSTEM; CHAOS 290
DBG(11344790550)> MTA STR: MTA0 Record Write len: 00001400
I'm puzzled, because everything seems right, and the WTM log line indicates sim_tape_wrtmk is called.
Hmm, an oddity here:
DBG(11340479792)> MTA DETAIL: MT0 WTM
DBG(11340482155)> MTA STR: MTA0 Record Write len: 00000000
Nowhere else does it say "Record Write len: 00000000".
Another run has the same anomaly. There's a WTM and then two 0-length records, and then the first (and here, only) record of a file. Yet, the image file has no tape mark here. In fact, the last record of the previous file seems missing. (It's as if writing the 0-length record erases records backwards.)
DBG(11882976663)> MTA STR: MTA0 Record Write len: 0000007D
DBG(11882976663)> MTA STR: MTA0 Record Written len: 0000007E
DBG(11882978258)> MTA DETAIL: MT0 WTM
DBG(11882980625)> MTA STR: MTA0 Record Write len: 00000000
DBG(11882985654)> MTA STR: MTA0 Record Write len: 00000000
DBG(11883687221)> MTA STR: MTA0 Record Write len: 00000672
DBG(11883687221)> MTA STR: MTA0 Record Written len: 00000672
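The anomaly can be searched for mechanically. The sketch below scans debug log lines for zero-length "Record Write" entries that follow a WTM before any non-empty record is written. The function name `zero_writes_after_wtm` is made up for this example, and the matching relies only on the line shapes quoted above.

```python
import re

# Matches "Record Write len: XXXXXXXX" but not "Record Written len: ...".
RECORD_WRITE = re.compile(r"Record Write len: ([0-9A-Fa-f]{8})")

def zero_writes_after_wtm(lines):
    """Return indices of zero-length writes seen after a WTM line
    and before the next non-empty record write."""
    hits = []
    after_wtm = False
    for index, line in enumerate(lines):
        if line.rstrip().endswith("WTM"):
            after_wtm = True
            continue
        match = RECORD_WRITE.search(line)
        if not match:
            continue
        if int(match.group(1), 16) == 0:
            if after_wtm:
                hits.append(index)
        else:
            after_wtm = False
    return hits

sample = [
    "DBG(11882976663)> MTA STR: MTA0 Record Write len: 0000007D",
    "DBG(11882976663)> MTA STR: MTA0 Record Written len: 0000007E",
    "DBG(11882978258)> MTA DETAIL: MT0 WTM",
    "DBG(11882980625)> MTA STR: MTA0 Record Write len: 00000000",
    "DBG(11882985654)> MTA STR: MTA0 Record Write len: 00000000",
    "DBG(11883687221)> MTA STR: MTA0 Record Write len: 00000672",
    "DBG(11883687221)> MTA STR: MTA0 Record Written len: 00000672",
]
print(zero_writes_after_wtm(sample))
```

On the excerpt above this reports the two zero-length writes at indices 3 and 4.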
It would be nice to see the CONO/DATAIO around the write 0.
DBG(11882979060)> MTA CONO: MT CONO 347 control 70200 0 476372406424 000000000000
DBG(11882979067)> MTA CONO: MT CONO 343 start 60200 0 21 000000060221 000000000000 PC=030135
DBG(11882979067)> MTA EXP: Setting status 000000000002
DBG(11882979205)> MTA CONO: MT CONO 343 start 64200 0 21 000000064221 000000000000 PC=030272
DBG(11882979212)> MTA CONI: MT CONI 346 status2 000170000040 0 000000000040 PC=030111
DBG(11882979218)> MTA CONI: MT CONI 346 status2 000170000040 0 000000000040 PC=030114
DBG(11882979234)> MTA CONI: MT CONI 346 status2 000170000040 0 000000000040 PC=030155
DBG(11882980205)> MTA EXP: MT0 Init write
DBG(11882980625)> MTA STR: MTA0 Record Write len: 00000000
DBG(11882980625)> MTA DETAIL: MT0 Write 0
Here's the SIMH debug log, and the reboot.tape file that was created.
http://lars.nocrew.org/tmp/debug.tgz
The patch seems to fix it. It held up for 60 runs.
I'll make some notes here about the current SALV problem.
Occasionally during the ITS build, the TRAN subroutine in SALV will halt. It can look like this:
The address 220323 can be resolved by checking SALV BIN. It's READ3+5:
SHORTL is cleared in two places and set in one. First, in the rewind subroutine. I don't think this is active since SALV is busy reading files from the tape. Second, the tape read subroutine. It's likely this is being called when reading files.
Finally, it's set further down in MREAD: