telescope-browser / telescope

browser for the small internet
https://telescope-browser.org
ISC License

Error loading - buffer event error #11

Status: Closed (valadect closed this issue 3 months ago)

valadect commented 7 months ago

Steps to reproduce.

  1. Navigate to gemini://gnubox.org/
  2. Telescope fails to load the page with a "buffer event error"

More Info

This happens in both the latest release as well as when building from source.

The same capsule works fine on bombadillo and lagrange.

omar-polo commented 7 months ago

Hello @valadect, thanks for the report. I can reproduce it (with any capsule, actually) on OpenBSD using libevent2 from packages. I've been bitten again by some differences between libevent 1.x (which is in base on OpenBSD) and libevent2.

omar-polo commented 7 months ago

Yeah, this is due to me wanting to use libtls and reaching into libevent's bufferevent abstraction. Unfortunately it'll take me a few days to fix this; it's not straightforward.

valadect commented 7 months ago

Best of luck! At least on my end it's only the odd capsule here and there so take all the time you need.

omar-polo commented 7 months ago

Actually, my testing was busted. I was in a hurry and didn't notice that I was mixing libevent 1.x from base and libevent 2.x from ports.

Now that I have more time, I tried again and can't reproduce it. I tested on Alpine Linux using libevent 2.1.12 and libretls 3.7.0 (this was a few hours ago).

Right now the capsule seems down, but I get the same error in telescope, gg(1) and lagrange:

% gg gemini://gnubox.org
gg: handshake: handshake failed: error:02FFF036:system library:func(4095):Connection reset by peer

maybe they're doing maintenance right now.

Can you please tell me a bit more about your system? (OS, version) I'd like to replicate and understand this issue.

Thanks!

valadect commented 7 months ago

Yeah seems to be down at the moment. I've been trying to find another capsule with the same issue but no luck so far.

I've run both latest git as well as latest tagged release and got the same results. I'm running telescope on Fedora Asahi Remix (Fedora 39) aarch64.

All other capsules I've come across work fine, just strange that only telescope was the one that couldn't load it previously.

omar-polo commented 7 months ago

Maybe I finally have a clue.

After another user report, I looked more closely, and it seems that with OpenSSL (on Linux) I sometimes get the bufferevent error (now "read error") on some capsules due to a missing close_notify. I couldn't reproduce it with gnubox.org, but gemini://gmi.noulin.net/ quite often results in the error here.

Can you reproduce it too? If not, could you please check out the debug-tls branch, run make && ./telescope 2>>log, and post the contents of the log file after reproducing the issue?

valadect commented 7 months ago

Can confirm I'm getting a buffer event error on that capsule for 0.8.1.

In debug-tls it's a read error.

failure(s) in tls_read:
- error:0A000126:SSL routines::unexpected eof while reading
failure(s) in tls_close:
- error:0A000197:SSL routines::shutdown while in init

Thanks for sticking with this!

omar-polo commented 7 months ago

> error:0A000126:SSL routines::unexpected eof while reading

That's the missing close_notify.

Unfortunately I can't do anything about it: it depends on the TLS library used by libtls. LibreSSL seems to be a bit more permissive and reports the failure later (on OpenBSD I can still read these capsules), while newer OpenSSL (I guess 3.x+, I still have to double check) is stricter.

From the point of view of a TLS library a missing close_notify matters a lot, since it means that the connection could have been abruptly interrupted.

gmniserv is one of the misbehaving servers, and it's also unmaintained :(

I'll try at least to improve the logging in these cases, so it's easier to understand what's going wrong.

valadect commented 6 months ago

Bummer, thanks for checking it out. Feel free to close.

sikmir commented 6 months ago

I have a similar problem on macOS (telescope 0.9, libretls):

$ sudo dtruss telescope gemini://gemini.omarpolo.com
...
recvmsg(0x4, 0x7FF7B6AB75B0, 0x0)        = 0 0
write_nocancel(0x2, "telescope: \0", 0xB)        = 11 0
write_nocancel(0x2, "connection closed\0", 0x11)         = 17 0
write_nocancel(0x2, ": \0", 0x2)         = 2 0
write_nocancel(0x2, "No such file or directory\n\0", 0x1A)       = 26 0
...

telescope 0.8.1 has no such problem.

omar-polo commented 6 months ago

@sikmir this seems like a different issue. That message is issued either by the main process or by the network process when the other party dies. I suspect in this case it's the network process dying. Do you have some core file lying around after it crashes? Maybe you can attach a debugger to the network process and see why it's dying?

It could also be interesting to try to bisect this. I suspect it may be related to the new event loop, so it would be useful to know whether telescope as of b19b8dbca985e2f567bb3f476b116ea18c1ca9a2 works (it's the parent of 98d3e6c172747dc58042bde09a848d3e03572934, where the new event loop was introduced).

Thanks! :)

sikmir commented 6 months ago

> Do you have some core file lying around after it crashes?

No.

> It could also be interesting to try to bisect this. I suspect it may be related to the new event loop, so it would be useful to know whether telescope as of b19b8db works (it's the parent of 98d3e6c, where the new event loop was introduced).

No, the same problem with b19b8dbca985e2f567bb3f476b116ea18c1ca9a2.

I guess something is wrong with the dependencies, since telescope 0.9 works fine on macOS if built with nix (https://github.com/NixOS/nixpkgs/pull/290955), but doesn't if built with macports (https://github.com/sikmir/macports-ports/blob/telescope/net/telescope/Portfile).

omar-polo commented 6 months ago

@sikmir Oh, I see. It's strange. I don't have a Mac so unfortunately I can't test, but if you have some time, a useful thing would be to start telescope on one of the built-in pages (e.g. echo about:new > ~/.cache/telescope/session), launch telescope, attach a debugger to the net process, and then attempt to open a page. If my intuition is right, it's the network process dying and you should get a backtrace.

I don't know if macOS has a working setproctitle; on OpenBSD at least, pgrep -lf telescope shows two entries: telescope:net and telescope:ui. Otherwise, if macOS doesn't use random PIDs, the greater one will be the net process.

If it's not the network process crashing somehow, then it must be the main one (the ui). In that case, running gdb telescope (or lldb) and then running telescope should similarly give you a backtrace.
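The PID-picking heuristic described above can be sketched in a few lines of Python. This is only an illustration of the reasoning in the comment, not part of telescope: the function name pick_net_pid is hypothetical, and the process titles are assumed to look like the OpenBSD ones quoted above.

```python
from typing import Optional


def pick_net_pid(pgrep_output: str) -> Optional[int]:
    """Pick the PID of the net process from `pgrep -lf telescope` output.

    Assumption: each line looks like "<pid> <title or command line>".
    """
    entries = []
    for line in pgrep_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # ignore lines that do not start with a PID
        entries.append((int(parts[0]), parts[1]))

    if not entries:
        return None

    # Preferred case: setproctitle works and the title names the process.
    # Spaces are dropped so "telescope: net" would also match (an assumption).
    for pid, title in entries:
        if "telescope:net" in title.replace(" ", ""):
            return pid

    # Fallback from the comment above: without random PIDs, the greater PID
    # should belong to the net process.
    return max(pid for pid, _ in entries)
```

The returned PID can then be handed to a debugger, for example lldb -p <pid> or gdb -p <pid>.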

Thanks :)

P.S.: I don't grok nix, but it seems that the derivation still requires libevent, which is no longer a dependency as of 0.9. Regarding the Portfile instead, why are you removing libgrapheme? if found, the bundled version is not used at all, unless there's a bug. I actually could add a --with-libgrapheme flag to assert that it must use the system version and not the bundled one.

sikmir commented 6 months ago

> it seems that the derivation still requires libevent, which is no longer a dependency as of 0.9.

Good point, I've missed it.

> Regarding the Portfile instead, why are you removing libgrapheme? if found, the bundled version is not used at all, unless there's a bug.

Yes, it's just to be sure.

omar-polo commented 6 months ago

On 2024/02/27 07:56:56 -0800, Nikolay Korotkiy @.***> wrote:

> > Regarding the Portfile instead, why are you removing libgrapheme? if found, the bundled version is not used at all, unless there's a bug.
>
> Yes, it's just to be sure.

Ah, good, I thought it was still built!

Thanks,

Omar Polo

sikmir commented 6 months ago

$ otool -L work/telescope-0.9/telescope
work/telescope-0.9/telescope:
    /opt/local/libexec/openssl3/lib/libssl.3.dylib (compatibility version 3.0.0, current version 3.0.0)
    /opt/local/libexec/openssl3/lib/libcrypto.3.dylib (compatibility version 3.0.0, current version 3.0.0)
    /opt/local/lib/libtls.24.dylib (compatibility version 25.0.0, current version 25.1.0)
    /opt/local/lib/libncurses.6.dylib (compatibility version 6.0.0, current version 6.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)

I guess that's the root cause: telescope requires LibreSSL, but LibreSSL and OpenSSL can't be co-installed.
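As a rough aid for spotting this kind of mismatch, the sketch below parses otool -L output and reports which directory each TLS-related library was loaded from. It is a heuristic under stated assumptions: the helper names tls_library_dirs and mixed_prefixes are made up for illustration, and libraries coming from different directories are only a hint worth investigating, not proof that two TLS implementations are mixed.

```python
import posixpath
import re
from typing import Dict

# Library names of interest: libssl, libcrypto and libtls.
TLS_LIB = re.compile(r"^lib(ssl|crypto|tls)\.")


def tls_library_dirs(otool_output: str) -> Dict[str, str]:
    """Map each TLS-related library name to the directory it is loaded from."""
    found = {}
    for line in otool_output.splitlines():
        # Each dependency line looks like "<path> (compatibility version ...)".
        path = line.strip().split(" (", 1)[0]
        if not path or path.endswith(":"):
            continue  # skip blank lines and the "<binary>:" header line
        name = posixpath.basename(path)
        if TLS_LIB.match(name):
            found[name] = posixpath.dirname(path)
    return found


def mixed_prefixes(otool_output: str) -> bool:
    """True when the TLS-related libraries come from more than one directory."""
    return len(set(tls_library_dirs(otool_output).values())) > 1
```

For the output above, this reports libssl and libcrypto under /opt/local/libexec/openssl3/lib and libtls under /opt/local/lib. A reasonable next step would be running otool -L on the libtls dylib itself to see which libcrypto it was built against.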

omar-polo commented 6 months ago

@sikmir Oh yeah, you can't mix LibreSSL and OpenSSL in the same address space. (There are some tricks, but I don't know them and wouldn't recommend them either.)

I made telescope link to libssl and libcrypto for the client certificate generation feature, and it's not possible to disable it yet.

I think the best solution would be to pick just one of the two TLS libraries (either LibreSSL or OpenSSL) and stick with it. The names are unfortunately too close for my taste, but the choices are LibreSSL (which ships its own libtls) or OpenSSL together with libretls.

As far as I can see, MacPorts packages both LibreSSL and libretls+OpenSSL, so either should be viable. I'm biased towards LibreSSL, but both work (and in general it's better to choose the one that is more 'popular' on the target system).

Thanks!

omar-polo commented 3 months ago

This issue has been fixed by @ThomasAdam in a3e4d56b6d9bcfca48f3d8c8f1e526e95b0c2f64 (https://codeberg.org/op/telescope/pulls/3), thank you!

Now telescope keeps rendering what it received and shows a W character in the modeline.
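The behaviour described here (keep what was received, flag the page instead of failing) can be sketched as follows. This is a Python illustration of the idea only, not the C code from the commit above. The function name read_body is hypothetical, and recv is assumed to be something like the recv method of an ssl.SSLSocket created with suppress_ragged_eofs=False, so that a missing close_notify surfaces as ssl.SSLEOFError.

```python
import ssl
from typing import Callable, Tuple


def read_body(recv: Callable[[int], bytes], bufsize: int = 4096) -> Tuple[bytes, bool]:
    """Read until EOF and return (data, truncated).

    truncated is True when the peer closed the connection without sending
    close_notify, so the data may be incomplete.
    """
    chunks = []
    truncated = False
    while True:
        try:
            chunk = recv(bufsize)
        except ssl.SSLEOFError:
            # Unexpected EOF: keep what arrived so far and remember the
            # problem instead of discarding the whole response.
            truncated = True
            break
        if not chunk:
            break  # clean shutdown: the peer sent close_notify
        chunks.append(chunk)
    return b"".join(chunks), truncated
```

A caller would render the returned data either way and show a warning marker (the W in the modeline, in telescope's case) when truncated is True.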