Closed (sqs closed 9 months ago)
Leaving open until we confirm the fix. What needs to be done:
sg start app or a release binary (not the macOS bundle)
sg start app or a new release binary
Note that Linux behavior must be tested because, per #50020, there are differences, with host.docker.internal not existing there, so it is best we make sure it is working.
Assigning myself to this (as I also need to make sure I can run sg start app).
I don't have a Linux machine though, @fkling you're on Linux right?
Confirmed working on macOS:
Steps followed:
sg start app (ran from commit: 82eca3c247)
HEAD
Observations:
@umpox and I figured out why it doesn't work on Linux but it's not clear what the best resolution is.
The error I'm getting is
Head "http://host.docker.internal:3082//.executors/scip/upload": dial tcp 172.17.0.1:3082: connect: connection refused
As per @slimsag's suggestion I confirmed that the docker container was started with --add-host=host.docker.internal:host-gateway. So what's going on?
It all boils down to the fact that the sourcegraph app is listening on interface 127.0.0.1, but the docker gateway's interface/IP is 172.17.0.1. I.e. host.docker.internal resolves to the 172.* address but the app is not listening on that interface.
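The mismatch described above can be sketched in Go. This is a simplified model, not Sourcegraph code: the function name acceptsConnectionsTo is hypothetical, and the model only captures that a socket bound to the unspecified address accepts traffic for any local address, while a socket bound to a specific address accepts traffic only for that address.

```go
package main

import (
	"fmt"
	"net"
)

// acceptsConnectionsTo reports whether a TCP listener bound to bindIP would
// accept a connection addressed to destIP. Hypothetical helper, simplified
// model: an unspecified bind address (0.0.0.0 or ::) matches any destination;
// otherwise the destination must equal the bound address.
func acceptsConnectionsTo(bindIP, destIP string) (bool, error) {
	bind := net.ParseIP(bindIP)
	dest := net.ParseIP(destIP)
	if bind == nil || dest == nil {
		return false, fmt.Errorf("invalid IP: bind=%q dest=%q", bindIP, destIP)
	}
	if bind.IsUnspecified() {
		return true, nil
	}
	return bind.Equal(dest), nil
}

func main() {
	// 172.17.0.1 is the address host.docker.internal resolved to in the
	// error message quoted in this thread.
	const gateway = "172.17.0.1"
	for _, bind := range []string{"127.0.0.1", "0.0.0.0", gateway} {
		ok, err := acceptsConnectionsTo(bind, gateway)
		if err != nil {
			panic(err)
		}
		fmt.Printf("listener on %s reachable via %s: %v\n", bind, gateway, ok)
	}
}
```

Under this model only the 127.0.0.1 case is unreachable from the container, which matches the connection refused error.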
To confirm this I made the following experiment:
Start a dummy HTTP server on 3082 (or any other port, it doesn't matter) via
while true; do echo -e "HTTP/1.1 200 OK\r\n$(date)\r\n\r\n" | nc -l -p 3082 -s 127.0.0.1 -q 1; done
This simulates the upload endpoint.
Connect to the endpoint from inside docker via
docker run --rm --add-host=host.docker.internal:host-gateway curlimages/curl curl --head 'http://host.docker.internal:3082//.executors/scip/upload'
This fails with a similar error:
curl: (7) Failed to connect to localhost port 3082 after 0 ms: Couldn't connect to server
NOTE: Various sources on the Internet seem to suggest that --add-host host.docker.internal:host-gateway should just work on Linux too, but these sources often don't state which interface(s) the target service listens on. If it's on all interfaces (0.0.0.0) then yes, it will work. If it's on localhost then no, or at least not in my case.
There are two ways to make this work:
Listening on all interfaces (0.0.0.0, or omitting -s <IP> from the test command above), or explicitly listening on docker's gateway interface (172.17...., whatever it is called) both work.
Starting the container with --network=host --add-host=host.docker.internal:127.0.0.1. I was able to confirm that with the sourcegraph binary itself by hacking these options into the command, as shown in the screenshot.
Listening on all interfaces isn't a good solution for obvious reasons. If the sourcegraph binary only needs to expose those services to docker then having those services listen on the docker gateway interface seems to be the best solution. Maybe that would even work in an OS-agnostic way. Ideally we could find a solution that works on every OS. If not then we should have the ability to configure host or container options for different builds.
Awesome work digging into this! That is a super helpful write-up!
Listening on all interfaces isn't a good solution for obvious reasons. If the sourcegraph binary only needs to expose those services to docker then having those services listen on the docker gateway interface seems to be the best solution.
This seems like a sane approach to me; @eseliger do you have any thoughts/opinions here?
Listening on the docker default network as well sounds reasonable to me for the App use-case. Not sure how easy it'd be to detect if that port needs to be claimed or not. It's weird to me that macOS and Linux differ here 😀
One thing to keep in mind, docker installations on Linux are generally more customized than they are on Mac where virtually everyone just uses whatever docker gives them. On Linux, other firewall things and such could still break us, but I guess that's "expected" if you configure your firewall in weird ways.
@eseliger are executors the only part that are run via docker in app?
I'm now thinking that it might make sense to create a separate, sourcegraph app specific network and have App listen on that interface specifically. Do you, or @slimsag know which services need to be accessible to docker and which need to be accessible on the host? (e.g. the web frontend obviously needs to be accessible from the host).
This is what I'm seeing when I run sg start app:
tcp 0 0 127.0.0.1:3434 0.0.0.0:* LISTEN 1904/.bin/sourcegra
tcp 0 0 127.0.0.1:3178 0.0.0.0:* LISTEN 1904/.bin/sourcegra
tcp 0 0 127.0.0.1:3180 0.0.0.0:* LISTEN 1904/.bin/sourcegra
tcp 0 0 127.0.0.1:3181 0.0.0.0:* LISTEN 1904/.bin/sourcegra
tcp 0 0 127.0.0.1:3182 0.0.0.0:* LISTEN 1904/.bin/sourcegra
tcp 0 0 127.0.0.1:3184 0.0.0.0:* LISTEN 1904/.bin/sourcegra
tcp 0 0 127.0.0.1:3188 0.0.0.0:* LISTEN 1904/.bin/sourcegra
tcp 0 0 127.0.0.1:3189 0.0.0.0:* LISTEN 1904/.bin/sourcegra
tcp 0 0 127.0.0.1:3082 0.0.0.0:* LISTEN 1904/.bin/sourcegra
tcp 0 0 127.0.0.1:3090 0.0.0.0:* LISTEN 1904/.bin/sourcegra
tcp 0 0 127.0.0.1:6996 0.0.0.0:* LISTEN 1904/.bin/sourcegra
tcp 0 0 127.0.0.1:4319 0.0.0.0:* LISTEN 1904/.bin/sourcegra
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN 1904/.bin/sourcegra
tcp 0 0 127.0.0.1:9991 0.0.0.0:* LISTEN 1904/.bin/sourcegra
What are these doing and which ones need to be exposed only to executors/other docker processes? Sorry for all the questions, my knowledge of the backend is limited... pointing me in the right direction (e.g. where all of this is configured) probably suffices.
@eseliger are executors the only part that are run via docker in app?
To my understanding, yes, that's correct.
I'm now thinking that it might make sense to create a separate, sourcegraph app specific network and have App listen on that interface specifically. Do you, or @slimsag know which services need to be accessible to docker and which need to be accessible on the host? (e.g. the web frontend obviously needs to be accessible from the host).
I think it could generally make sense for the executor to create a separate network to run the docker containers in, this is generally best practice for docker containers anyways to not rely on the default network. That doesn't even need to be app specific. The only service executors need to be able to reach is the public facing HTTP endpoint of the frontend service. That would probably be port 3180(?).
Is the separate docker network something you guys are able to take on or should we raise this with the executors people?
@eseliger
Is the separate docker network something you guys are able to take on or should we raise this with the executors people?
Given my limited understanding of the backend I'd appreciate it if someone else can look into this (probably more efficient).
The only service executors need to be able to reach is the public facing HTTP endpoint of the frontend service. That would probably be port 3180(?).
So this port is meant to be "world" accessible? Because that's what we currently seem to be doing. If that's fine, then maybe there is nothing to be done to make it work on Linux in production. Unfortunately I cannot build a Linux binary with the latest changes to confirm my assumptions. But it looks like only when running via sg we are overriding the interface for the frontend which causes the service to only listen on localhost and not on all interfaces. I don't know why we are doing this though.
In other words: While using a separate network might be the better way anyways, it might still not work in dev mode, due to the special setup.
So this port is meant to be "world" accessible? Because that's what we currently seem to be doing. If that's fine, then maybe there is nothing to be done to make it work on Linux in production. Unfortunately I cannot build a Linux binary with the latest changes to confirm my assumptions. But it looks like only when running via sg we are overriding the interface for the frontend which causes the service to only listen on localhost and not on all interfaces. I don't know why we are doing this though.
Correct! Only the globally accessible interface of sourcegraph needs to be reachable by an executor. Let me know if that works or if we need to do more work here :)
Repro:
The reason appears to be that the docker run command for the upload step is missing --add-host=host.docker.internal:host-gateway, which is added when the executors.frontendURL site config setting URL has the host host.docker.internal. The dockerOptions func that sets this sees c.FrontendURL as http://localhost:3080, which is NOT the URL that the commands running inside the executor's Docker containers can access the host Sourcegraph App at.
The fix might just be consulting c.ExecutorsFrontendURL() instead of c.FrontendURL in dockerOptions, or maybe it is more complex and Sourcegraph App needs to listen on more than just localhost so that Docker containers can contact it.
Upload step command:
Log output:
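The dockerOptions idea from this comment can be sketched in Go as follows. This is an illustration under stated assumptions, not the actual Sourcegraph implementation: addHostFlags is a hypothetical stand-in for the relevant part of dockerOptions, and it takes the URL as a plain string rather than reading c.FrontendURL or c.ExecutorsFrontendURL().

```go
package main

import (
	"fmt"
	"net/url"
)

// dockerInternalHost is the hostname that containers use to reach the host.
const dockerInternalHost = "host.docker.internal"

// addHostFlags returns the extra docker run flags needed so that a container
// can resolve the given frontend URL. Hypothetical helper: it adds the
// host-gateway mapping only when the URL's host is host.docker.internal.
func addHostFlags(frontendURL string) ([]string, error) {
	u, err := url.Parse(frontendURL)
	if err != nil {
		return nil, fmt.Errorf("parsing frontend URL: %w", err)
	}
	if u.Hostname() == dockerInternalHost {
		return []string{"--add-host=" + dockerInternalHost + ":host-gateway"}, nil
	}
	return nil, nil
}

func main() {
	// The first URL is what dockerOptions reportedly sees today; the second
	// has the host that triggers the flag (port taken from the error above).
	for _, raw := range []string{
		"http://localhost:3080",
		"http://host.docker.internal:3082",
	} {
		flags, err := addHostFlags(raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %d extra flag(s) %v\n", raw, len(flags), flags)
	}
}
```

Feeding the helper the localhost URL yields no flag, which is consistent with the missing --add-host observed in the upload step. Whether the flag alone is enough still depends on which interface the app listens on, as discussed earlier in the thread.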