Support running driver tests on non-linux OSes

bzz commented 5 years ago

Right now the integration test part of the bblfsh-sdk test flow relies on ability to share a /var/run/bblfshctl.sock between Host and an a container running bblfshd, so that a local built driver image could be installed in bblfhsd from that docker instance (instead of usual dockerHub) using the docker-daemon transport.

That is a workaround that works only on linux (as other OSes simiply do not use sockets and communicate though TCP to actual docker daemon running in linux VM), found during previous attempts to archive similar thing under https://github.com/bblfsh/bblfshd/issues/74.

What is required here, is to optionally change/override the way how bblfshd talks to docker in bblfhsh-sdk case to the communication over TCP (similar to how it is possible to override how bblfshctl talks to bblfhsd from socket to TCP using ctl-network/ctl-address CLI options )

As one can see from vendor/github.com/containers/image/docker/daemon/daemon_src.go:38 - the old version of dependency we are using inside runtime does not support it, but it seems to be possible with the latest one.

For convenience, it would make sense to expose this options as both ENV vars and CLI args with the same names, so they could be easily customized on bblshd startup and applied in integration test of bblfsh-sdk test.

creachadair commented 5 years ago

I would advocate that instead of making this switchable, we should switch it on all platforms to use a loopback TCP socket. As far as I can tell that is supported on all OSs we are likely to encounter, and I think there is almost no advantage to having multiple options here.

Of course, we might have to support both during a transition, but I think the goal state should be to wind up with only one method of connection. Loopback is nice because it's still in-kernel, and as long as we are careful not to bind only to the loopback address does not expose the server more widely than the socket does.

dennwc commented 5 years ago

I think it's better to rely on what Docker client provides - it will automatically select the connection based on OS and environment.

Forcing it to loopback will most probably force everyone to reconfigure Docker on Linux.

creachadair commented 5 years ago

I think it's better to rely on what Docker client provides - it will automatically select the connection based on OS and environment.

Forcing it to loopback will most probably force everyone to reconfigure Docker on Linux.

Agreed. Sorry, I wrote imprecisely: I don't mean that we should hard-code loopback in the code itself, just that we should make it use TCP and set it up to use loopback in our own usage.

dennwc commented 5 years ago

Sorry, I don't see how it's different.

If developer installs Docker on Linux, it will use Unix socket by default, and won't listen on any TCP port. Then, after installing an SDK, all commands will try to use TCP and will fail.

creachadair commented 5 years ago

Sorry, I don't see how it's different.

If developer installs Docker on Linux, it will use Unix socket by default, and won't listen on any TCP port. Then, after installing an SDK, all commands will try to use TCP and will fail.

Maybe we're talking about different things? I thought this issue was about the communication between the host and the Babelfish daemon, not the docker server. I don't think it matters what Docker itself uses to communicate with its CLI, but it does matter (if I read this correctly) how the host client communicates with the daemon.

Did I miss something important?

dennwc commented 5 years ago

Sorry for the confusion, the issue is about bblfshd communication with the Docker daemon on the host.

As a part of the integration test suite, we run a bblfshd daemon in Docker. We build a dev version of the driver locally (in Docker) and install it into the running bblfshd instance. For this to work, bblfshd needs to know how to communicate with the Docker daemon to pull the container image from it.

For Linux, Docker runs on the host and is accessible via the Unix socket, which is exposed as a volume to the bblfshd container.

But for macOS, Docker runs in the VM and cannot be accessed the same way as on Linux.

creachadair commented 5 years ago

Ah, okay, I misunderstood the layering here. In that case, you can probably ignore most of what I said above. Thanks for the clarification!

bzz commented 5 years ago

Stranger than fiction: while integration tests hang on 10 min timeout with logs

$ time CI="yes" go run vendor/gopkg.in/bblfsh/sdk.v2/cmd/bblfsh-sdk/main.go test
 ....

docker run --rm --privileged -v /var/run/docker.sock:/var/run/docker.sock bblfsh/bblfshd
+ docker exec sha256:36d10eb1a380839c848130dd116f77d95f480bf78c59959490f0e5cb566053fa bblfshctl driver install javascript docker-daemon:sha256:36d10eb1a380839c848130dd116f77d95f480bf78c59959490f0e5cb566053fa

If I manually list installed drivers in that container - it seems to have a driver installed 😕

$ docker exec 61158be3fa9b bblfshctl driver list
+------------+---------------------------------------------------------------------------------------+--------------+--------+-----------+--------+-------------+---------------+
|  LANGUAGE  |                                         IMAGE                                         |   VERSION    | STATUS |  CREATED  |   OS   |     GO      |    NATIVE     |
+------------+---------------------------------------------------------------------------------------+--------------+--------+-----------+--------+-------------+---------------+
| javascript | docker-daemon:sha256:36d10eb1a380839c848130dd116f77d95f480bf78c59959490f0e5cb566053fa | dev-8c08ccf1 | beta   | 5 minutes | alpine | 1.10-alpine | node:8-alpine |
+------------+---------------------------------------------------------------------------------------+--------------+--------+-----------+--------+-------------+---------------+
Response time 441.6µs

Or am I exec'ing into the wrong container here?

dennwc commented 5 years ago

Yeah, it seems like a different container. The test runs and kills it's own bblfshd instance.

creachadair commented 5 years ago

On macOS, it turns out that Docker symlinks /var/run/docker.sock to the "real" location under the invoking user's $HOME. I manually updated the command to expand the link on the LHS of the bind mapping, and it appears to work:

docker run --rm --privileged -v $HOME/Library/Containers/com.docker.docker/Data/docker.sock:/var/run/docker.sock bblfsh/bblfshd

I hand-patched the SDK to do this evaluation when starting up the driver, and it is able to connect. However, I didn't fully push this change down the stack, and the bblfsh/bblfshd image I was using was pre-baked with the old value and failed partway through.

Nonetheless, I think this strategy has the potential to succeed. The only real change I made was to add

// LocalSocket returns the fully expanded path of the Docker socket on the                                                                                                                                      
// current machine, by expanding symbolic links in Socket.                                                                                                                                                      
func LocalSocket() string {
   if v, err := filepath.EvalSymlinks(Socket); err == nil {
      return v
   }
   return Socket // fallback                                                                                                                                                                                    
}

and to replace the constant strings with calls to that function.

creachadair commented 5 years ago

As you might expect, it turns out to be a little more complicated than that: The code that talks to the daemon is buried inside the container image library and to control it we would have to plumb through a SystemContext carrying the bound DockerDaemonHost we want to use. Right now it is set to nil which falls back on DefaultDockerHost.

So this is reachable, but will require some thought about how to wire it up. I'll keep at it.

creachadair commented 5 years ago

Interestingly, the docker package suggests maybe we could override this in the environment.

creachadair commented 5 years ago

…except, we don't use that package, we use https://godoc.org/github.com/ory/dockertest/docker instead. And the config there doesn't expose the host in the same way (I think).

bblfsh / sdk

Support running driver tests on non-linux OSes #370