OpenBazaar / openbazaar-go

OpenBazaar 2.0 Server Daemon in Go
MIT License

name resolution failures running openbazaard in docker #1023

Open guruvan opened 6 years ago

guruvan commented 6 years ago

Using openbazaar/server:v0.12.0 (and my own image which substitutes alpine for scratch for the final image created), apparently any service that requires openbazaard to resolve DNS names fails.

  18:59:01.991 [fetch] [ERROR] Failed to fetch from https://api.bitcoincharts.com/v1/weighted_prices.json Get https://api.bitcoincharts.com/v1/weighted_prices.json: dial tcp: lookup api.bitcoincharts.com: no such host
  18:59:01.991 [fetchCurrentRates] [ERROR] Failed to fetch bitcoin exchange rates

tcpdump shows that no DNS lookups are being performed at all.

With my alpine-based ob container, I've tested name resolution inside the container from a shell, and that works fine.

nodes are configured as per this config: ob://QmY8HpD6iLHqZUErP6PZhW9QGiqFehFRBbAGVLcA34mxdk/store/docs-ob-server-config-remote-server-config-example

Nodes are also running with the -l debug switch, and there's really nothing more in the logs.

tyler-smith commented 6 years ago

Strange, I've not seen this. Our search infrastructure and other systems use the docker image and haven't had this issue.

Am I correct in understanding that it's not working with an Alpine base either? That's weird because my only guess would be something missing in SCRATCH.

@christroutner Have you seen this?

guruvan commented 6 years ago

scratch is an empty image, so there's nothing missing there. There are no user-serviceable parts in the openbazaar/server image; it's a minimal /dev /proc /sys + the above files, and additions to /etc from docker.

I've got a few nodes, each running on openbazaar/server:v0.12.0, mazaclub/openbazaar:v0.12.0, or mazaclub/openbazaar:master, and they all behave the same. The containers all resolve names, but openbazaard doesn't, and doesn't seem to try. One node is running on openbazaar/server:opengateway (which is v0.11.1 IIRC).

To be clear, here's the Dockerfile used:

# Build stage - Create static binary
FROM golang:1.9
WORKDIR /go/src/github.com/OpenBazaar/openbazaar-go
COPY . .
RUN go build --ldflags '-extldflags "-static"' -o /opt/openbazaard .

# Run stage - Import static binary, expose ports, set up volume, and run server
FROM alpine:latest
EXPOSE 4001 4002 9005
VOLUME /var/lib/openbazaar
COPY --from=0 /opt/openbazaard /opt/openbazaard
COPY --from=0 /etc/ssl/certs/ /etc/ssl/certs/
ENTRYPOINT ["/opt/openbazaard"]
CMD ["start", "-d", "/var/lib/openbazaar"]

It is definitely weird, and definitely appears to be a docker-related issue, but I can't imagine what's going on.

To help isolate it, I copied the binary from the openbazaar/server:v0.12.0 image to the host (NixOS) and made a new node. It found bitcoin peers immediately and shows no lookup errors in the logs. I then ran the docker image with the same node directory, and it has the issue. I left this node free of SSL for testing, and checked both with and without -l debug (just in case that turns on extra code).

I also tried running the container in privileged mode, and with --net host, to rule out most everything I can think of.

openbazaard works fine on the host AFAICT, and not in docker. running it directly on the host produces no errors, and smtp connection works to gmail - name resolution is working.

Docker version is 17.09.1-ce, build 19e2cf6259bd7f027a3fff180876a22945ce4ba8

I'll spin up an Ubuntu 18.04 machine in the morning and test this same node running in docker on that. Maybe there's something wrong with how nix has the namespace configured.

guruvan commented 6 years ago

OK. It appears I've tracked this down to a long-standing issue in docker. It's not clear why you guys aren't seeing this, but that's kinda docker for you.

Per this docker forum thread, docker sets an obscure option (ndots:0) in /etc/resolv.conf, for no reason that I can imagine. I confirmed that the option is present in my containers.
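A quick way to check a container for this option is to search its resolver config. The sketch below is illustrative only: `find_ndots` is a hypothetical helper name, and it assumes the image has a shell and grep (true for an alpine-based image, not for scratch).

```shell
# Hypothetical helper: print any "ndots:N" option found in a
# resolv.conf-style file. Returns non-zero if the option is absent.
find_ndots() {
    grep -o 'ndots:[0-9]*' "$1"
}
```

Inside the container this would be run as `find_ndots /etc/resolv.conf`. Output of `ndots:0` would match the setting described above.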

Changing this while openbazaard is running did not appear to solve the issue, but changing it prior to running the daemon did solve the problem. No hostname lookup failures when it checks exchange rates, smtp connection works, and buy buttons that rely on exchange rates work.

I've added an entrypoint.sh to my docker image to modify the contents of /etc/resolv.conf prior to running openbazaard. This, of course, requires a minimal shell. alpine + ash seems to work fine, though with a few compile errors that should be investigated.
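For readers who want to try the same approach, here is a minimal sketch of what such an entrypoint could look like. It is not the script from the branch linked later in this thread. The function names and the `RESOLV_CONF` override are assumptions made for illustration, and the binary path is taken from the Dockerfile above. Since docker bind-mounts /etc/resolv.conf into the container, the sketch overwrites the file's contents rather than replacing the file (which is what `sed -i` would attempt).

```shell
#!/bin/sh
# Hypothetical entrypoint.sh sketch: strip the ndots option from
# resolv.conf, then hand off to openbazaard.
set -eu

# Print resolv.conf-style file $1 with any "ndots:N" option removed.
# An "options" line left with no options is dropped entirely.
strip_ndots() {
    sed -e 's/[[:space:]]*ndots:[0-9]*//g' \
        -e '/^options[[:space:]]*$/d' "$1"
}

# Rewrite file $1 in place. The cleaned text is read fully before the
# file is truncated, and the file is overwritten rather than renamed.
fix_resolv_conf() {
    cleaned=$(strip_ndots "$1")
    printf '%s\n' "$cleaned" > "$1"
}

# Only act when arguments are supplied (the Dockerfile's CMD provides
# them). With no arguments nothing runs, so the functions can be
# sourced and tested without side effects.
# RESOLV_CONF is an assumed override, defaulting to the real file.
if [ "$#" -gt 0 ]; then
    fix_resolv_conf "${RESOLV_CONF:-/etc/resolv.conf}"
    exec /opt/openbazaard "$@"
fi
```

To use something like this with the Dockerfile above, the script would be copied into the alpine stage and ENTRYPOINT pointed at it, leaving CMD unchanged. The fix has to run before the daemon starts, since changing the file while openbazaard was running did not help.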

I can file a pr if you like

tyler-smith commented 6 years ago

@guruvan Nice work tracking this down. Glad you've got something working for now. It's definitely a strange issue that I haven't seen before.

> I can file a pr if you like

I'd definitely like to see the change you made to /etc/resolv.conf, but I want to research the issue some more before modifying our Dockerfile. Could you push your changes to a branch somewhere so I can check them out?

guruvan commented 6 years ago

https://github.com/mazaclub/openbazaar-go/tree/fix/docker-ndots

guruvan commented 6 years ago

Automated builds on dockerhub are set up for this now - https://hub.docker.com/r/mazaclub/openbazaar-go/

mazaclub/openbazaar-go:latest is just these changes based on v0.12.0, as is the above branch.