Not receiving `OO` causes session to get stuck

sorenisanerd commented 2 years ago

It seems like a relatively common problem that the terminating OO from sz gets lost or misparsed or something. See all these issues: https://github.com/sorenisanerd/gotty/issues/46 https://github.com/tsl0922/ttyd/issues/239 https://github.com/tsl0922/ttyd/issues/279 https://github.com/Eugeny/tabby/issues/6259 https://github.com/cferdinandi/tabby/issues/132 https://github.com/Eugeny/tabby/issues/5196 https://github.com/trzsz/tabby-trzsz/issues/2 https://github.com/Eugeny/tabby/issues/5132

This lets me call .allow_missing_OO(true) on the session object which SEEMS to make the problem go away:

diff --git a/src/zsession.js b/src/zsession.js
index 5f0b8f9..cafe57f 100644
--- a/src/zsession.js
+++ b/src/zsession.js
@@ -193,6 +193,10 @@ Zmodem.Session = class ZmodemSession extends _Eventer {
         //console.warn("Invalid first Zmodem header", hdr);
     }

+    allow_missing_OO(flag) {
+        this._allow_missing_OO = !!flag;
+    }
+
     /**
      * Sets the sender function that a Session object will use.
      *
@@ -549,7 +553,7 @@ Zmodem.Session.Receive = class ZmodemReceiveSession extends Zmodem.Session {
             if (this._input_buffer.length < 2) return;

             //if it’s OO, then set this._bytes_after_OO
-            if (Zmodem.ZMLIB.find_subarray(this._input_buffer, OVER_AND_OUT) === 0) {
+            if (this._allow_missing_OO || Zmodem.ZMLIB.find_subarray(this._input_buffer, OVER_AND_OUT) === 0) {

                 //This doubles as an indication that the session has ended.
                 //We need to set this right away so that handlers like

...but I can't reliably trigger the problem, so my confidence in this approach is not great.

Do you have thoughts on how to address this? The spec suggests that if the OO does not arrive, you should just move on with your life after a brief timeout, so maybe we just need to implement that timeout? Just getting stuck in this state doesn't benefit anyone.

FGasper commented 2 years ago

Yeah a timeout seems reasonable; all the same, the terminating OO should come since this is all TCP.

sorenisanerd commented 2 years ago

It very much should! I have spent hours and hours trying to pinpoint how they get lost, but to no avail.

On Tue, Oct 18, 2022 at 4:58 PM Felipe Gasper @.***> wrote:

Yeah a timeout seems reasonable; all the same, the terminating OO should come since this is all TCP.

— Reply to this email directly, view it on GitHub https://github.com/FGasper/zmodemjs/issues/33#issuecomment-1283153992, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABHCWXMFXBW6WRQ6LGFCBLWD42RPANCNFSM6AAAAAARIRSPRM . You are receiving this because you authored the thread.Message ID: @.***>

FGasper commented 2 years ago

I wonder if it’s a bug in sz. Maybe it neglects to flush its output buffer or something.

Maybe _consume_ZFIN() could be where the timeout starts, at the end of which this._on_session_end() would be called and _bytes_after_OO gets set to an empty array … and if the usual OO arrives then we clear the timeout.

sorenisanerd commented 2 years ago

I suspected sz, too. I've stared at it a LOT today and it doesn't seem to be doing anything out of the ordinary. https://github.com/UweOhse/lrzsz/blob/master/src/lsz.c#L1855-L1857 is pretty standard.

I haven't been able to trigger it if I run either sz, bash, or gotty under strace :(

Either way, if sz is broken, we still need to support current versions of it.

destinytaoer commented 1 year ago

1685008392368-7a637501-cb3c-4302-b7a0-4372833f40ee

i have same problem. when i use sz to download file, sometimes it return ZFIN at the end as described in the picture above, but sometimes it doesn't return and stuck my terminal. it more frequent occurrences when the file is larger than 100MB

it stucks the tty, not my frontend terminal. maybe it is a problem of lrzsz? what can i do for it?

In my computer, it works fine. but when i use it in k8s exec api to access a pod, that error comes

1684935844327-9eef2bcd-96ce-4553-9d22-b014088a91db

1684935745685-8b2da49c-de24-4c13-a77c-74842dc25e67

FGasper / zmodemjs

Not receiving `OO` causes session to get stuck #33