omar-polo / gmid

a Gemini server
https://gmid.omarpolo.com
ISC License
98 stars 8 forks source link

Support for raw TCP reverse proxies that talk ProxyProtocol #28

Closed Sir-Photch closed 2 weeks ago

Sir-Photch commented 1 month ago

Hi, I was wondering if you'd have any interest in gmid being able to talk tcp proxy protocol? This would alleviate the need for another instance of gmid acting as the reverse proxy in front of the "actual" gmid server.

To be specific, I already have a nginx reverse proxy that also handles other ports / hosts, and right now, I can configure nginx to proxy_pass TCP streams via the nginx_stream_proxy_module on port 1965 to some other host. This works nicely.

However, now my proxy IP is shown in the logs of gmid, instead of the actual remote host. (Granted, I can log the access at the nginx-site, but that won't give me any info about the actual path that was requested.)

This could be fixed by gmid accepting the proxy protocol, which is designed to preserve the requesting host IP while passing through proxies.

Anyway, thanks for this software. It is very BSD-esque :)

omar-polo commented 1 month ago

Sorry for the delay, somehow I missed this issue. I'm not very familiar with the proxy protocol, but it seems something that definitely belongs to gmid scope. So yeah, it's something I'd be willing to implement (or to merge a patch for :P)

It would also make sense for gmid to support the proxy protocol when it is proxying the request to another server, for symmetry.

Sir-Photch commented 1 month ago

There is already a library for this on github: https://github.com/kosmas-valianos/libproxyprotocol

What is your stance on external dependencies? This one in particular is LGPL/GPL and I'm not sure if that is compatible with ISC.

Anyway, I don't really have too much experience with "raw" C, but I'd be willing to have a look at this if it would be about glueing together gmid and a ProxyProtocol library.

omar-polo commented 1 month ago

What is your stance on external dependencies?

The less, the better :)

jokes aside, I prefer to keep dependencies only for the the "hard" things. I won't ever implement TLS by myself, nor libevent. However, from a quick read of the proxy protocol specification, it doesn't seem particularly hard to implement (especially v2), so I'd prefer to just write some custom code for it. (we don't really need all of proxy v2, some parts don't seem useful for gmid, like ALPN.)

OpenSMTPD has an implementation in proxy.c, but there's also some smtpd logic in it.

This one in particular is LGPL/GPL and I'm not sure if that is compatible with ISC.

IANAL, but if it's LGPL it has the "linker" exception and so can be linked to code under different licenses as far as i remember, but see above.

I'd start with a small scope by defining exactly what parts we care about. For example, I believe we can safely implement only the v2 protocol, since it's easier and more performant than v1, and only a subset of all the extensions.

Sir-Photch commented 1 month ago

I believe we can safely implement only the v2 protocol, since it's easier and more performant than v1, and only a subset of all the extensions.

Note that the nginx_stream_proxy_module only supports v1:

https://nginx.org/en/docs/stream/ngx_stream_proxy_module.html#proxy_protocol

tracked in:

https://trac.nginx.org/nginx/ticket/1639

omar-polo commented 1 month ago

oh, that comes as a surprise to me! I haven't spotted the difference between what nginx accepts and what provides in the documentation! It's really unfortunate because for us it means that we have to parse a variable-length header before we can go inside the TLS machinery. v2 would be much more faster and simpler.

This means that to support v1 decently we'll probably need to use tls_accept_cbs(3) instead of tls_accept_socket(3) and roll a small buffering layer. Not a big deal but a bit annoying indeed.

Sir-Photch commented 1 month ago

Does v1 explicitly say that it has a variable-length header? If not, and the only difference is different header lengths between v1 and v2 headers, what about just making this a config setting, to only accept one of the two?

Otherwise, if I understand correctly, you'd read the socket first into a buffer, check if it is proxy_protocol, strip and parse that header, and only then call tls_accept_cbs with that buffer which is truncated?

omar-polo commented 1 month ago

The problem with v1 is that while it guarantees a maximum length, the header itself is variable-length (see for example an header with IPv4 addresses and one with IPv6 addresses).

This means that we can't just read N bytes off the socket and then continue into the TLS machinery, because we don't know how much to read. We can either read one byte at a time (which sucks) or rolling a small buffered layer between the socket and libtls (via tls_accept_cbs(3)).

Neither is a blocker for this feature to be implemented, and if nginx only supports v1 I guess we'll have to cope, but it is a minor annoyance indeed.

Sir-Photch commented 1 month ago

So I have tinkered with server.c a little bit and got the buffer layer working. Are you already working on this?

If not, I can create a draft PR for you to have a look, and to decide, if:

  1. it is rubbish, and you continue, or
  2. it is fine, and I continue.

What do you think?

omar-polo commented 1 month ago

Sorry for the delay, been really busy these last days. Thank you for working on this!

Sir-Photch commented 1 month ago

No hard feelings! I'm not in a hurry.

omar-polo commented 2 weeks ago

Closing this since #30 was merged, thank you!