artyom-poptsov / guile-ssh

Guile-SSH is a library that provides access to the SSH protocol for GNU Guile programs.
https://memory-heap.org/~avp/projects/guile-ssh
GNU General Public License v3.0

timeout on channel read returns <eof> #29

Closed Apteryks closed 2 years ago

Apteryks commented 3 years ago

Hello,

I've been trying to understand a problem in Guix where reading from an SSH channel returns EOF. My debugging led me to find that it occurs when there's nothing to read on the channel once the channel's specified timeout has elapsed.

Normally the underlying libssh ssh_channel_read returns 0 when there's nothing to read or an error if there was an error, not EOF, IIUC. Is this expected behavior? If so, how can someone discriminate a timeout from a true EOF?

Thank you,

Maxim

Apteryks commented 3 years ago

This is what we see when stracing Guile on the client side (guile-ssh):

[pid  4311] poll([{fd=19, events=POLLIN}], 1, 15000) = 0 (Timeout)
[pid  4311] write(1, "$8 = #<eof>\n", 12) = 12

The timeout is set to 15 s; it polls for that long then returns EOF.
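
To make the distinction in that trace concrete, here is a minimal sketch in plain POSIX C. It is not Guile-SSH or libssh code; `read_with_timeout` and the `read_outcome` codes are hypothetical names for illustration. It only shows that at the file-descriptor level a `poll` result of 0 (timeout) and a `read` result of 0 (EOF) are separate events that a caller can report separately.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical outcome codes for this sketch; not part of libssh or Guile-SSH. */
enum read_outcome { READ_DATA, READ_TIMEOUT, READ_EOF, READ_ERROR };

/* Wait up to timeout_ms for fd to become readable, then read once.
   On READ_DATA, *nread holds the number of bytes stored in buf. */
static enum read_outcome
read_with_timeout(int fd, char *buf, size_t len, int timeout_ms, ssize_t *nread)
{
    assert(buf != NULL && nread != NULL && len > 0);

    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return READ_ERROR;
    if (ready == 0)
        return READ_TIMEOUT;    /* poll() = 0: nothing arrived in time */

    *nread = read(fd, buf, len);
    if (*nread < 0)
        return READ_ERROR;
    if (*nread == 0)
        return READ_EOF;        /* writer closed its end: a true EOF */
    return READ_DATA;
}
```

With a pipe as the descriptor, an idle writer yields `READ_TIMEOUT`, while closing the write end yields `READ_EOF`; the trace above shows the first case being printed as the second.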

Apteryks commented 3 years ago

Dunno if it's a bug or by design, but IIUC, libssh ssh_channel_poll will return 0 (the length of the stdbuffer) when a timeout elapses during polling (and not SSH_AGAIN, as one might expect).

This condition doesn't seem to be expected in guile-ssh read_from_channel_port (it expects either SSH_ERROR, SSH_EOF or a positive value).

Apteryks commented 3 years ago

So what happens on a timeout is that ssh_channel_read returns 0 (the same as when it encounters EOF). So probably guile-ssh just treats it as an EOF, since it doesn't have more information to work with.
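
One way a caller could separate the two cases is sketched below, under assumptions. The helper takes two plain integers: the return value of a timed channel read (in libssh that would come from something like `ssh_channel_read_timeout`) and a flag saying whether the remote side has sent EOF (which libssh exposes through `ssh_channel_is_eof`). The function and enum names are hypothetical, this is not what Guile-SSH does, and treating every negative value as an error is a simplification.

```c
#include <assert.h>

/* Hypothetical status codes for this sketch. */
enum channel_read_status {
    CHANNEL_DATA, CHANNEL_TIMEOUT, CHANNEL_EOF, CHANNEL_ERROR
};

/* nread:    result of a timed channel read (assumed: negative on error,
             0 on timeout or EOF, positive for bytes read).
   eof_flag: nonzero if the remote end is known to have sent EOF. */
static enum channel_read_status
classify_channel_read(int nread, int eof_flag)
{
    if (nread < 0)
        return CHANNEL_ERROR;
    if (nread > 0)
        return CHANNEL_DATA;
    /* A zero-length read is ambiguous; only the EOF flag tells them apart. */
    return eof_flag ? CHANNEL_EOF : CHANNEL_TIMEOUT;
}

/* The two zero-length cases discussed in this thread. */
static void
classify_examples(void)
{
    assert(classify_channel_read(0, 0) == CHANNEL_TIMEOUT);
    assert(classify_channel_read(0, 1) == CHANNEL_EOF);
}
```

The point of the sketch is only that the zero return value needs a second piece of information before it can be reported as EOF.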

Apteryks commented 3 years ago

Yep, seems to be Guile's peek_byte_or_eof that chooses to return EOF when nothing was available/read.

artyom-poptsov commented 3 years ago

Hello,

what version of GNU Guile do you use?

Apteryks commented 3 years ago

Hello! It should be Guile 3.0.7, I think (the one used by Guix).

Thanks!

artyom-poptsov commented 3 years ago

Hello again,

could you please check this branch https://github.com/artyom-poptsov/guile-ssh/tree/wip-fix-nonblocking-eof and see what will happen?

artyom-poptsov commented 3 years ago

Besides, you can build Guile-SSH for debugging as follows:

CFLAGS=-DDEBUG make -e -j4

I added extra debug traces in the channels code.

Apteryks commented 3 years ago

I've updated the guile-ssh package locally to build from commit 2e25d852104f375936e81d9d7163892c6e828e68 and ran:

$ ./pre-inst-env guix offload test /etc/guix/machines.scm tm
guix offload: testing 1 build machines defined in '/etc/guix/machines.scm'...
Backtrace:
In ice-9/boot-9.scm:
  1752:10 11 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In unknown file:
          10 (apply-smob/0 #<thunk 7f1b1ad29f60>)
In ice-9/boot-9.scm:
    724:2  9 (call-with-prompt _ _ #<procedure default-prompt-handler (k proc)>)
In ice-9/eval.scm:
    619:8  8 (_ #(#(#<directory (guile-user) 7f1b1ad23c80>)))
In guix/ui.scm:
   2205:7  7 (run-guix . _)
  2168:10  6 (run-guix-command _ . _)
In ice-9/boot-9.scm:
  1752:10  5 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In guix/scripts/offload.scm:
   704:21  4 (check-machine-availability _ _)
In srfi/srfi-1.scm:
   586:17  3 (map1 (#<session root@10.42.0.243:22 (connected) 7f1b162acfc0>))
In guix/inferior.scm:
    259:2  2 (port->inferior _ _)
    241:2  1 (read-repl-response _ _)
In ice-9/boot-9.scm:
  1685:16  0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Throw to key `match-error' with args `("match" "no matching pattern" #<eof>)'.

So immediately I don't see a change; but I also don't see debug traces so I'm not sure I'm testing it correctly (I've specified CFLAGS=-DDEBUG as a configure flag).

Hellseher commented 3 years ago

@Apteryks Could you share the Guile code you used for the test and your package specification? I can't reproduce it locally with guile-ssh 0.13.1.

guix describe
Generation 173  Nov 12 2021 21:01:27    (current)
  guix da73727
    repository URL: https://git.savannah.gnu.org/git/guix.git
    branch: master
    commit: da73727f1a1c49bd0b834d2d4da48d578062b0ae

Apteryks commented 2 years ago

I cannot reproduce this anymore with the same setup but with a newer Guix that uses the recently released guile-ssh 0.15.1: even though the low-spec server is busy doing something, the client (guile-ssh 0.15.1) waits for it without reporting EOF, it seems.

I guess it can be closed :-).

Thank you!