Closed larsbrinkhoff closed 11 months ago
This may look unfamiliar, but is not a typo: DEFINE LOOK [ZZZ
The bracket is there to give the ZZZ dummy a "balanced" bindclass. This means the macro argument must have balanced brackets, if any, and the argument includes those brackets. This is because the LOOK macro is used either with a plain symbol or with a [] literal. Without the [ bindclass, the brackets are taken to enclose the macro argument but are not part of it.
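To make the difference concrete, here is a rough Python model of the two behaviours described above. It is only a sketch of the explanation, not of MIDAS's actual parser: the function name `take_argument`, the `balanced` flag, and the choice of separators for a plain symbol are all made up for illustration.

```python
def take_argument(text, balanced):
    """Return the macro argument found at the start of `text`.

    balanced=True models a dummy with the "[" bindclass: brackets must
    balance and are kept as part of the argument.
    balanced=False models the default described above: the brackets
    enclose the argument but are not part of it.
    (Hypothetical model; separators for plain symbols are an assumption.)
    """
    if not text.startswith("["):
        # Plain symbol: take everything up to the first separator.
        for i, ch in enumerate(text):
            if ch in ", \t\n":
                return text[:i]
        return text

    depth = 0
    for i, ch in enumerate(text):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                # Matching close bracket found.
                return text[: i + 1] if balanced else text[1:i]
    raise ValueError("unbalanced brackets in macro argument")
```

With this model, `take_argument("[A [B] C] rest", True)` keeps the outer brackets, while the same call with `False` returns only what is between them. A plain symbol comes back unchanged either way.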
Thanks. I was looking at that and had no clue what is going on.
A word of warning. The auto backup program uses core links for communication. If everything goes well, they go away after the program completes. However, if something goes wrong, there's a chance there will be data remaining in the core links. This may interfere with a future run.
It's possible to view active core links by making a file listing of any of the core link devices, e.g. CLO:. See SYSDOC;CLO > for details. Often it's possible to empty core links by typing e.g. ^R CLO:BACKUP;DUMP TYO. If there's nothing there, DDT will hang, which is remedied with ^G.
It should be possible to delete core links, but the SYSDOC file warns there is a bug. I seem to have hit this bug, and I had to clear all the core link data structures by runtime patching ITS.
I wanted to empty the core links before use, but I don't know how! .IOT on an empty file just hangs. Anyone know?
I tried running BACKUP on ES and it seems to have gotten stuck -- with the log file locked, of course, so I can't see why. PEEK says:
7 EJS HACTRN LARS HANG > 30 9 0%
17 EJS EMACS EJS 10!0 < 75 27 0% REALTM
20 EJS BACKUP EJS CLOI T11 1 0 0%
21 EJS DUMP EAK CLOI < 24 1 0%
The fact that it is doing a CLOI, and that the SNAME of DUMP is EAK, suggests it is prompting for something after having gotten only as far as the EAK directory when dumping.
PEEK details show:
Ch Idx Uname Jname Mode Bks+Wds Rd% Pk File Name
1 21 EJS DUMP R 4+0 16% 3 EAK; EAKPUR 314
0 20 EJS BACKUP W 0+475 2 DRAGON; BACKUP LOG
Chaos network connections:
Idx Usr Uname Jname State Ibf Pbf Nos Ack R Win T Foreign Addr Flag
33 21 EJS DUMP OPEN 15 0 13 0 15 13 BRIDGE 75230
24 buffers, 5 of which are free.
rtape shows this:
Peer 05460: Read record: 40 octets
Peer 05460: Read mark
Peer 05460: Read continuous records
Peer 05460: Read record: 5120 octets
Peer 05460: Read record: 5120 octets
Peer 05460: Read record: 5120 octets
Peer 05460: Read record: 5120 octets
Peer 05460: Read record: 5120 octets
Peer 05460: Read record: 5120 octets
Peer 05460: Read record: 5120 octets
Peer 05460: Read record: 5120 octets
Peer 05460: Read record: 5120 octets
Peer 05460: Read record: 5120 octets
Peer 05460: Read record: 5120 octets
Peer 05460: Read record: 5120 octets
Peer 05460: Read record: 5120 octets
Peer 05460: Read record: 5120 octets
Stopping the DUMP job shows that it is here:
input/ .IOT 1,4
INPUT+1/ JUMPL 4,INPUT1
INPUT+2/ .IOT 3,4
INPUT+3/ POPJ 17,
INPUT1/ .CALL 701 (OPEN)
INPUT1+1/ .LOSE 1400
INPUT1+2/ JRST INPUT
At INPUT.
I killed the BACKUP job and looked at DRAGON;BACKUP LOG. It ended like this:
_DUMP I
TAPE NO=48
REEL 0 FIRST USER = %SYS LAST USER = _MSGS_
REWINDING
CHECKING INCREMENTAL DUMPTAPE NO 48 CREATION DATE 231104
REEL NO 0 OF INCREMENTAL DUMP
Remote-Tape protocol error--Record-stream input buffer overflow--record too long ?
_
ES>
So it got the error we frequently get. I'm running a cbridge built from the HEAD of the cleaning-windows branch of chaosnet-bridge. And I'm running with an rtape built from the HEAD of the lars/rtape branch of chaosnet-tools.
So I think I'm running with the latest everything.
Rtape is now on the master branch of chaosnet-tools, but there's no change regarding packet handling. I have run complete "DUMP I" backups many times during testing, and I have not seen this error lately. Your host is Linux, right? I'm on some old software: Ubuntu 16/18, Linux 4.15, glibc 2.23/2.27.
I compared the branch I used with the master branch and there was nothing significantly different. I haven’t managed to do a BACKUP yet on ES, although I’ve tried several times. It always fails in the same way. I’m running Ubuntu 20.04 on the Linode that hosts ES. I can try it on my Ubuntu 22.04 laptop and report on the results, but all three of my ITS systems there have tiny file systems compared to ES. Also, they all run pdp10-k* simulators rather than KLH10, as ES does.
Ah, KLH10. I will test that.
I ran BACKUP again on ES and got the same error:
_REMOTE
TAPE SERVER HOST=5401
DRIVE=i231104-49.dump
READ-ONLY? N
REMOTE TAPE REWOUND
_DUMP I
TAPE NO=49
REEL 0 FIRST USER = %SYS LAST USER = _MSGS_
REWINDING
CHECKING INCREMENTAL DUMPTAPE NO 49 CREATION DATE 231104
REEL NO 0 OF INCREMENTAL DUMP
Remote-Tape protocol error--record type other than data, read-file-mark, or status ?
_
This is the same read error I get when I list a dump.
Note: this time, I used the master branch of chaosnet-tools for rtape. So this still doesn't work for me.
If I simply don't use BACKUP, and use DUMP, I get the same problem while LISTing the tape successfully created by BACKUP. So perhaps we should go ahead and merge this anyway, and continue to track down the problem with RTAPE and CBRIDGE.
Right, it's not likely this is a problem with the BACKUP program, but rather rtape.c, cbridge, or KLH10. I don't have KLH10 set up yet. Eric, have you tested with pdp10-ka? That's what I have been using, and I have not seen the protocol errors for quite a while.
Are you going to address any of my review comments/suggestions? If not, I can approve and we can merge.
I took the incremental backup tape I created with BACKUP on ES and moved it over to the host where EXA (ITS under pdp10-ka) lives. I used DUMP/LIST to list that tape. I did not get the error I reported doing the ICHECK or DUMP/LIST on ES (klh10) on the same tape.
So it does look like the issue is KLH10-related. Please try to run under a KLH10 ITS and see if you can find the issue.
I'm good to go. And I'm building ITS for KLH10 now.
This is a program to make an unattended backup. It runs DUMP as an inferior, mounts a remote tape, and runs an incremental dump. The tape number is determined as one more than the highest number in the tape database. When the backup has finished, it leaves a log in DRAGON;BACKUP LOG.
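For anyone skimming, the tape-numbering rule above amounts to the following. This is a Python sketch of the rule only, not the program's actual code: `next_tape_number` is a made-up name, the tape database is modelled as a plain list of integers, and starting at 1 when the database is empty is my assumption.

```python
def next_tape_number(existing_numbers):
    """Return one more than the highest tape number seen so far.

    `existing_numbers` stands in for the tape database (hypothetical
    representation). An empty database yields 1, which is an assumption.
    """
    return max(existing_numbers, default=0) + 1
```

For example, with tapes 12, 47 and 48 already recorded, the next incremental dump would be tape 49.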
The program is installed on TT and HX as DRAGON;WEEKLY BACKUP. Anyone else?