Closed trimpim closed 1 month ago
I merged the commit but am still hesitant to close this issue. For example, send_sig() is used in sk_stream_error() to signal EPIPE errors. @ssumpf do you already have an opinion on that? Could you look into this?
@trimpim: Could you provide a backtrace for the case in which send_sig is called?
@ssumpf sorry for the delay. Unfortunately I hadn't recorded the backtrace before. So I had to run the tests again.
BOARD=linux ; KERNEL=linux ; ARCH=x86_64
remote_access -> ssh_server] 0x1000000 .. 0x10ffffff: linker area
remote_access -> ssh_server] 0x40000000 .. 0x4fffffff: stack area
remote_access -> ssh_server] 0x50000000 .. 0x521b2fff: ld.lib.so
remote_access -> ssh_server] 0x10e17000 .. 0x10ffffff: libc.lib.so
remote_access -> ssh_server] 0x10d73000 .. 0x10e16fff: vfs.lib.so
remote_access -> ssh_server] 0x103f000 .. 0x10d7fff: libssh.lib.so
remote_access -> ssh_server] 0x10d8000 .. 0x165efff: libcrypto.lib.so
remote_access -> ssh_server] 0x165f000 .. 0x1675fff: zlib.lib.so
remote_access -> ssh_server] 0x10d62000 .. 0x10d72fff: vfs_lxip.lib.so
remote_access -> ssh_server] 0x1676000 .. 0x18a4fff: lxip.lib.so
...
remote_access -> ssh_server] _genode_errno:96 unsupported errno 104
remote_access -> ssh_server] Error: Function send_sig not implemented yet!
remote_access -> ssh_server] backtrace "ep"
remote_access -> ssh_server] Will sleep forever...
The message _genode_errno:96 unsupported errno 104 is printed multiple times. If I haven't overlooked anything, it starts appearing after ~15 SSH logouts and normally comes directly before the logout. I'm not sure if it is relevant, but it always appears directly before Error: Function send_sig not implemented yet!
remote_access -> ssh_server] Error: Function send_sig not implemented yet!
remote_access -> ssh_server] backtrace "ep"
remote_access -> ssh_server] Will sleep forever...
There's actually no backtrace here. Please enable -fno-omit-frame-pointer and rebuild lib/vfs_lxip.
The message _genode_errno:96 unsupported errno 104 is printed multiple times. If I haven't overlooked anything, it starts appearing after ~15 SSH logouts and normally comes directly before the logout. I'm not sure if it is relevant, but it always is directly before Error: Function send_sig not implemented yet!
Errno 104 is ECONNRESET and indeed missing from _genode_errno. You may try to add it to the following files.
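For illustration only, here is a minimal sketch of the kind of errno translation that would produce an "unsupported errno" message for an unmapped value. It is not the actual _genode_errno code: the enum sketch_errno, its members, and sketch_translate_errno are hypothetical names invented for this example. Only the Linux values 32 (EPIPE) and 104 (ECONNRESET) are real.

```c
#include <stdio.h>

/* Linux errno values as they arrive from the ported IP stack */
enum { LX_EPIPE = 32, LX_ECONNRESET = 104 };

/* Hypothetical target-side error codes (names and values are made up) */
enum sketch_errno {
	SKETCH_OK = 0,
	SKETCH_EPIPE,
	SKETCH_ECONNRESET,
	SKETCH_UNKNOWN
};

/*
 * Translate a Linux errno (positive or negative) into the hypothetical
 * target enum. Values without a case label end up in the default branch,
 * which is where a message like "unsupported errno 104" would originate.
 */
static enum sketch_errno sketch_translate_errno(int lx_errno)
{
	if (lx_errno < 0)
		lx_errno = -lx_errno;

	switch (lx_errno) {
	case 0:             return SKETCH_OK;
	case LX_EPIPE:      return SKETCH_EPIPE;
	case LX_ECONNRESET: return SKETCH_ECONNRESET; /* the case missing before */
	default:
		fprintf(stderr, "unsupported errno %d\n", lx_errno);
		return SKETCH_UNKNOWN;
	}
}
```

With the ECONNRESET case in place, a connection reset no longer falls through to the default branch.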
@trimpim: In addition to @chelmuth's comments, you can feed the resulting trace into the new backtrace tool found in the tool directory. For this to work:
cd <build-dir>/debug
<genode>/tool/backtrace ssh_server
paste:
remote_access -> ssh_server] 0x50000000 .. 0x521b2fff: ld.lib.so
remote_access -> ssh_server] 0x10e17000 .. 0x10ffffff: libc.lib.so
remote_access -> ssh_server] 0x10d73000 .. 0x10e16fff: vfs.lib.so
remote_access -> ssh_server] 0x103f000 .. 0x10d7fff: libssh.lib.so
remote_access -> ssh_server] 0x10d8000 .. 0x165efff: libcrypto.lib.so
remote_access -> ssh_server] 0x165f000 .. 0x1675fff: zlib.lib.so
remote_access -> ssh_server] 0x10d62000 .. 0x10d72fff: vfs_lxip.lib.so
remote_access -> ssh_server] 0x1676000 .. 0x18a4fff: lxip.lib.so
and then the actual backtrace into the terminal.
There's actually no backtrace here. Please enable -fno-omit-frame-pointer and rebuild lib/vfs_lxip.
@chelmuth how do I do this again? If I remember correctly, I had to add an option to etc/tools.conf, but I haven't used this in a long time and have been using a new computer since then.
@trimpim: You can add CC_OPT += -fno-omit-frame-pointer to your etc/tools.conf, but you have to make sure all the libs above are in your build command in the run script, not from the depot, otherwise the option will be ignored.
@ssumpf thanks for the info. This makes me realize that my run script, which produces the error, only uses depot archives, which are started/stopped using the depot_deploy mechanism.
If I can change some of the depot tooling to build depots with -fno-omit-frame-pointer, then I'm fine with recompiling the whole depot content for the test.
@trimpim: You can add it to the CC_OPT in base/mk/global.mk. This will enable the backtrace. You still need to build the ssh_server with the same options in your build directory to use the backtrace tool, though.
@ssumpf here is the processed output:
void Genode::log<Genode::Backtrace>(Genode::Backtrace&&)
* 0x170e6f3: lxip.lib.so:0xa16f3 W
* /data/genode/repos/base/include/base/log.h:170
lx_emul_trace_and_stop
* 0x170e7f0: lxip.lib.so:0xa17f0 T
* /data/genode/repos/base/include/base/log.h:86
send_sig
* 0x16ba47b: lxip.lib.so:0x4d47b T
* ??:?
sk_stream_error
* 0x177a71a: lxip.lib.so:0x10d71a T
* /data/genode/contrib/linux-d8c12b28a8ba8bddc3b0d12c2e3cb369fdfd5c75/src/linux/net/core/stream.c:191
tcp_sendmsg_locked
* 0x17c6437: lxip.lib.so:0x159437 T
* /data/genode/contrib/linux-d8c12b28a8ba8bddc3b0d12c2e3cb369fdfd5c75/src/linux/include/net/tcp.h:1891
tcp_sendmsg
* 0x17c6881: lxip.lib.so:0x159881 T
* /data/genode/contrib/linux-d8c12b28a8ba8bddc3b0d12c2e3cb369fdfd5c75/src/linux/net/ipv4/tcp.c:1485
lx_socket_sendmsg
* 0x1719265: lxip.lib.so:0xac265 T
* /data/genode/repos/dde_linux/src/lib/lxip/lx_socket.c:412
Lx_sendmsg::execute()
* 0x18042fc: lxip.lib.so:0x1972fc W
* /data/genode/repos/dde_linux/src/lib/lxip/socket.cc:342
Lx_kit::Task::run()
* 0x1718094: lxip.lib.so:0xab094 T
* /data/genode/repos/base/include/base/log.h:193
@trimpim: Thanks for the backtrace. send_sig is called in sk_stream_error in case the error is EPIPE and the MSG_NOSIGNAL flag is not set (the error probably happens in sk_stream_wait_connect). This looks somewhat optional; therefore, I would suggest keeping your commit and providing a dummy implementation of the function.
As for ECONNRESET, I will add it to the IP-stack, even though neither EPIPE nor ECONNRESET are handled/propagated by the VFS plugin at the moment.
Does your SSH server scenario work with the dummy implementation or are there any other issues?
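The condition described in this comment can be sketched in a few lines of self-contained C. This is a simplified model under stated assumptions, not the kernel source or the committed patch: sketch_stream_error, sketch_send_sig, sketch_send_sig_calls, and SKETCH_MSG_NOSIGNAL are hypothetical stand-ins, and any socket state handling the real sk_stream_error performs is left out. The dummy only mirrors the argument list of Linux's send_sig(int, struct task_struct *, int).

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/* Stand-in for the kernel's MSG_NOSIGNAL flag (0x4000 on Linux) */
#define SKETCH_MSG_NOSIGNAL 0x4000

struct task_struct; /* opaque, never dereferenced in this sketch */

/* Counts invocations so the effect of the dummy can be observed */
static int sketch_send_sig_calls;

/*
 * Dummy implementation: accepts the signal request, does nothing with it,
 * and reports success instead of stopping the component.
 */
static int sketch_send_sig(int sig, struct task_struct *task, int priv)
{
	(void)sig;
	(void)task;
	(void)priv;
	sketch_send_sig_calls++;
	return 0;
}

/*
 * Simplified model of the check: SIGPIPE is only requested if the error
 * is EPIPE and the caller did not pass the no-signal flag. The error code
 * itself is returned unchanged in either case.
 */
static int sketch_stream_error(int flags, int err)
{
	if (err == -EPIPE && !(flags & SKETCH_MSG_NOSIGNAL))
		sketch_send_sig(SIGPIPE, NULL, 0);

	return err;
}
```

Since the error code is returned regardless of whether the signal is sent, a no-op send_sig still lets the caller see -EPIPE, which is why the call looks optional here.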
@ssumpf Thanks for the information and for offering to add the error codes.
After adding the patch, there is one test in another component that fails, which did work with the old vfs_lxip and works with vfs_lwip. The test sends a really large packet (128 KiB) to the azure management client. After receiving this message, the azure management client can no longer send messages to the internet. I will try to capture the traffic and create another issue for that, if you are fine with that.
@trimpim: I will open an issue soon that will address lxip tests that currently still fail during nightly testing. You can put your issue there if you want.
@ssumpf I'll do that.
@ssumpf thanks, cherry picked to our working branch.
Merged to master.
This function gets called by some libssh applications using vfs_lxip.
For the dummy implementation I looked at the old port.