Chaosnet / chaosnet-tools

Tools for Chaosnet
http://chaosnet.net/

rtape: implement rewind-unload command #19

Open ams opened 2 months ago

ams commented 2 months ago

I cannot back up more than about 5 MiB of data on the Lisp Machine. The tape driver hits an end-of-tape condition, and then falls over.

bictorv commented 2 months ago

Please elaborate? I just tried making a partial backup of CDR, resulting in 9.8M.

ams commented 2 months ago

(defvar files (tape:list-all-files))
FILES
(tape:backup-files files  :tape-name "ams-fc-full")
Setting host to AMS-BRIDGE-1
Setting device unit to tape name "ams-fc-full"
Backing-up 3463 files: 135,589,809 total bytes
Writing file: FS: BACKUP-LOGS; AMS-FC.BACKUP-LOG#1
Writing file: FS: L; -READ-.-THIS-#2
...
Comparing "FS: L.DEMO; ALARAM.LISP#50" ... [Equal]
** End of Tape **
All files compared were equal.
Setting backup bits ... done.
Unloading tape ...
>>ERROR: Stream closed while reading status: #<RTAPE-STATUS ams-fc-full  TAPE::MOUNTED>
While in the function (:METHOD TAPE::RTAPE-DEVICE  :PROBE-STATUS)
  <- (:METHOD TAPE::RTAPE-DEVICE :UNLOAD) <- (:METHOD TAPE::LMFL-FORMAT :UNLOAD)
...

I get this very consistently and can't back up more than about 5 MiB.

ams commented 2 months ago

Output from RTAPE server:

ams@carbonium chaosnet-tools % ./rtape
11:48:31: Open connection from 04402
Peer 04402: Mount: type=BOTH, reel=ANY, drive=ams-fc-full, size=4096, density=1600
Peer 04402: Unknown operation: 8

larsbrinkhoff commented 2 months ago

That's a crucial clue. Operation 8 is "rewind-unload" which I haven't implemented. The elegant error handling consists of calling exit(1). Patches welcome.
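
A minimal sketch of what handling operation 8 without calling exit(1) could look like. The struct, the enum names, and the function handle_operation are hypothetical and are not taken from rtape.c; only the opcode numbers 6 and 8 come from this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Opcode numbers as quoted in this thread; 8 is the one the server rejected. */
enum {
    OP_REWIND        = 6,
    OP_REWIND_UNLOAD = 8   /* listed as OFFLINE in rtape-device.lisp */
};

/* Hypothetical per-connection state; not the struct rtape.c actually uses. */
struct tape_state {
    long position;  /* byte offset into the tape image file */
    int  mounted;   /* nonzero while a reel is mounted */
};

enum op_result { OP_OK, OP_UNSUPPORTED };

/* Handle one request. Unknown opcodes are logged and reported to the
   caller instead of terminating the whole server. */
static enum op_result handle_operation(struct tape_state *t, int opcode)
{
    assert(t != NULL);
    switch (opcode) {
    case OP_REWIND:
        t->position = 0;
        return OP_OK;
    case OP_REWIND_UNLOAD:
        t->position = 0;   /* rewind ... */
        t->mounted = 0;    /* ... then take the reel offline */
        return OP_OK;
    default:
        fprintf(stderr, "Unknown operation: %d\n", opcode);
        return OP_UNSUPPORTED;
    }
}
```

Returning a result code leaves it to the caller to decide whether to close just that one connection or to send an error status back to the client.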

ams commented 2 months ago

How would we want this to work? Just continue with the same file, or do -.tap kinda thing where idx is incremented?

ams commented 2 months ago

But isn't it still strange that the tape size is just ~5 MiB? It would take lots of tapes just to back up the System tree, which is 135 MiB.

bictorv commented 2 months ago

Yes, it's strange that you get an End-of-tape. I don't see how you set up your tape device, your transcript just says "Setting host..." and "Setting device unit...". Did you do (tape:select-device ...) before the call to tape:backup-files?

I don't get an end-of-tape, and thus the :unload method isn't called, and the backup finishes happily. Could it be something happening on the rtape server host, like the disk getting full? Or could it be related to differences in offsets between rtape.c and rtape-device.lisp? What's the value of tape:rtape-dlen (should be 16), and at what index are flags put in send_status in rtape.c (should be 34, 35)?

> How would we want this to work? Just continue with the same file, or do -.tap kinda thing where idx is incremented?

backup-files calls prompt-for-new-tape which will call rtape :set-options which will pop up a tv:choose-variable-values window. If you give the same unit name, you'll overwrite the last tape.

As for how you want it to work: the way it does for me, i.e., without getting an end-of-tape?

eswenson1 commented 2 months ago

Is there a possibility that you are setting the tape density/length (or there is a default value) that causes the LM client software to only write 5M before “wanting a new tape”?

ams commented 2 months ago

I'm not setting anything special, just MAKE-SYSTEM on TAPE, and then (tape:backup-files ...) which asks me about the host. The RTAPE host has plenty of disk left. @eswenson1 Not setting anything, so I'm confused why @bictorv is getting a different result, and why he has an rtape-dlen of 16.

(defconst rtape-DLEN        15)     ;Note: 16 *including* namelength

@bictorv Doesn't tape:select-device do the same as the :tape-name ?

As for how this should work, I think 'infinite' tape would be the best? Getting asked for new tapes constantly would be annoying.

As for operation 8, should it really be rewind-unload -- and not offline?

(defconst rtape-LOGIN-OPCODE           1)
(defconst rtape-MOUNT-OPCODE           2)
(defconst rtape-PROBE-OPCODE           3)
(defconst rtape-READ-OPCODE            4)
(defconst rtape-WRITE-OPCODE           5)
(defconst rtape-REWIND-OPCODE          6)
(defconst rtape-REWIND-SYNC-OPCODE     7)
(defconst rtape-OFFLINE-OPCODE         8)
(defconst rtape-FILEPOS-OPCODE         9)
(defconst rtape-BLOCKPOS-OPCODE        10)
(defconst rtape-WRITE-EOF-OPCODE       12)
(defconst rtape-CLOSE-OPCODE           13)
(defconst rtape-LOGIN-RESPONSE-OPCODE  33)
(defconst rtape-DATA-OPCODE            34)
(defconst rtape-EOFREAD-OPCODE         35)
(defconst rtape-STATUS-OPCODE          36)

bictorv commented 2 months ago

You have an old version of rtape-device, but setting rtape-DLEN to 16 should suffice for now. Perhaps you also have an old version of rtape.c, which would make things work, but if they mismatch you will get problems: you could randomly get an EOF flag, perhaps. Check your rtape.c: send_status should put the flags at offsets 34 and 35, and the last four lines (after buf[34] |= FLG_STRG) should use 36 as the constant instead of 35.

rewind-unload of a real tape would put the tape offline, so it probably means the same as offline.
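
To illustrate why a DLEN mismatch between rtape.c and rtape-device.lisp could produce a spurious EOF flag, here is a rough sketch of the offset arithmetic. The 18-byte prefix is an assumption inferred only from the numbers quoted above (flags at offset 34 with a 16-byte drive-name field); it is not read from rtape.c, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMPTION: number of status-record bytes that precede the drive-name
   field. Inferred from the thread's figures only: 34 - 16 = 18. */
#define STATUS_PREFIX_LEN 18

/* Offset of the first flag byte for a given drive-name field length. */
static size_t flags_offset(size_t dlen)
{
    assert(dlen > 0);
    return STATUS_PREFIX_LEN + dlen;
}

/* Nonzero when both sides look for the flag bytes in the same place. */
static int layouts_agree(size_t server_dlen, size_t client_dlen)
{
    return flags_offset(server_dlen) == flags_offset(client_dlen);
}
```

Under that assumption, a server writing with DLEN 16 puts the flags at offsets 34 and 35, while a client parsing with DLEN 15 reads them from 33 and 34, so it would interpret the last drive-name byte as flag bits.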

bictorv commented 2 months ago

> @bictorv Doesn't tape:select-device do the same as the :tape-name ?

It essentially does, if selected-device is an RTAPE device. :tape-name sets the unit only.

ams commented 2 months ago

> You have an old version of rtape-device, but setting rtape-DLEN to 16 should suffice for now. Perhaps you also have an old version of rtape.c, which would make things work, but if they mismatch you will get problems: you could randomly get an EOF flag, perhaps. Check your rtape.c: send_status should put the flags at offsets 34 and 35, and the last four lines (after buf[34] |= FLG_STRG) should use 36 as the constant instead of 35.

I'll double check .. I don't think I have an old version of rtape-device.lisp (version 4) -- rtape.c is definitely from master.

Though I see that I have this local change going on, so it is possible that I messed something up.

Index: tape/rtape-device.lisp
==================================================================
--- tape/rtape-device.lisp
+++ tape/rtape-device.lisp
@@ -36,11 +36,11 @@
 (defconst rtape-LOGIN-RESPONSE-OPCODE  33)
 (defconst rtape-DATA-OPCODE        34)
 (defconst rtape-EOFREAD-OPCODE     35)
 (defconst rtape-STATUS-OPCODE      36)

-(defconst rtape-DLEN       16)
+(defconst rtape-DLEN       15)     ;Note: 16 *including* namelength
 (defconst rtape-MAXSTRING  100)

 (defconst rtape-operations '(
                 rtape-LOGIN-OPCODE
                 rtape-MOUNT-OPCODE
@@ -340,23 +340,21 @@

 (defmethod (rtape-device :reset) ()
   (send self :deinitialize))

-
-(defmethod (rtape-device :status) ()
-  (when stream
-    (format t "~&Connected to ~S" host)
-    (describe status)))
-
-
 (defmethod (rtape-device :speed-threshold) (record-size)
   record-size)

 (defun rtape-unimplemented ()
   (declare (eh:error-reporter))
-  (cerror "do nothing" "unimplemented"))
+  ;; (cerror "do nothing" "unimplemented")
+  ;; Signal something which can be taken care of in other code!
+  (signal 'driver-error :device-type 'rtape-device
+     :error-code #x42
+     :error-message "Operation Not Yet Implemented")
+  )

 ;;; Tape positioning
 ;;;

 (defmethod (rtape-device :rewind) (&optional (wait-p t))

bictorv commented 2 months ago

> I'll double check .. I don't think I have an old version of rtape-device.lisp (version 4) -- rtape.c is definitely from master.

I'll send you my latest, version 9. Or you can pick it up from CDR:BV.TAPE; yourself.

As for the problem of getting an end-of-tape: the next time you get it, try (describe (send tape:*selected-device* :status)).

ams commented 2 months ago

We need better ways of syncing these hacks.

larsbrinkhoff commented 2 months ago

I suspect after rewind-unloading, the operator is supposed to mount another reel?

ams commented 2 months ago

I would assume so, yes. At least that is the behaviour Tape currently expects.
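
One possible shape for the "mount another reel" step on the server side, sketched under assumptions: each reel is a separate image file and rewind-unload advances an index. The `<drive>-<idx>.tap` naming is a guess at the scheme floated earlier in the thread, and reel_filename, next_reel, and struct reel_set are hypothetical names that do not exist in chaosnet-tools.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<drive>-<idx>.tap" into buf. Returns 0 on success, or -1 if the
   drive name is empty or the result would not fit in the buffer.
   The naming scheme itself is hypothetical. */
static int reel_filename(char *buf, size_t size, const char *drive, unsigned idx)
{
    int n;

    assert(buf != NULL && drive != NULL);
    if (strlen(drive) == 0)
        return -1;
    n = snprintf(buf, size, "%s-%u.tap", drive, idx);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical reel set: one drive name, many numbered image files. */
struct reel_set {
    const char *drive;  /* unit name given by the client */
    unsigned idx;       /* index of the currently mounted reel */
};

/* Called after a rewind-unload: advance to the next reel and compute
   the file name it would be stored under. */
static int next_reel(struct reel_set *rs, char *buf, size_t size)
{
    rs->idx++;
    return reel_filename(buf, size, rs->drive, rs->idx);
}
```

With a scheme like this, giving the same unit name again after an end-of-tape would select the next file rather than overwrite the previous reel.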