danhigham / cloudfoundry-tmate-buildpack

A buildpack to install tmate to enable remote tmux sessions

master.tmate.io lookup failure #2

Open nota-ja opened 9 years ago

nota-ja commented 9 years ago

I recently deployed a dummy app using cloudfoundry-tmate-buildpack to several private CF environments.

All but one of those environments work fine. In the exception, cf logs <APPNAME> --recent shows no connection information:

2015-02-04T19:11:57.84+0900 [App/0]   ERR 2015/02/04 10:11:57 Starting tmate...
2015-02-04T19:11:57.84+0900 [App/0]   ERR 2015/02/04 10:11:57 1000
2015-02-04T19:11:57.84+0900 [App/0]   ERR 2015/02/04 10:11:57 1000

No further output appeared in the log stream.

So I modified launch in cloudfoundry-tmate-buildpack to produce verbose logs, and modified the compile script to use a supplied launch binary if one is available.

After making those changes, I deployed the dummy app with the modified cloudfoundry-tmate-buildpack and found the following lines in tmux-server-45.log:

[tmate] master.tmate.io lookup failure. Retrying in 10 seconds (non-recoverable failure in name resolution)
[tmate] Looking up master.tmate.io...
[tmate] master.tmate.io lookup failure. Retrying in 10 seconds (nodename nor servname provided, or not known)
[tmate] Looking up master.tmate.io...
[tmate] master.tmate.io lookup failure. Retrying in 10 seconds (nodename nor servname provided, or not known)
...

These lines repeated every 10 seconds.

I searched the web for "master.tmate.io lookup failure" and found this issue: https://github.com/nviennot/tmate/issues/32 .

I also did some investigation of my own in an "OK" environment (a bosh-lite CF v194 running on my local machine) and in the "NG" environment (a v172-based CF built with micro bosh on a CloudStack-based VPC).

In both environments, name resolution of master.tmate.io, outbound HTTP access to the Internet, and outbound SSH connections to master.tmate.io all work.

The only difference I have found so far is that, in the NG env, a reverse DNS lookup of the container's public IP address (as reported by portquiz.net) returns no hostname from inside the warden container where my dummy app is running. The same reverse lookup works from the OK env's warden container. I am not sure whether this is related to the problem. The first session below is from the OK env and the second is from the NG env:

vcap@18clp6rlq2k:~$ curl http://portquiz.net:8080/
Port 8080 test successful!
Your IP: ***.***.***.***
vcap@18clp6rlq2k:~$ dig -x ***.***.***.***

; <<>> DiG 9.7.0-P1 <<>> -x ***.***.***.***
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54834
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;***.***.***.***.in-addr.arpa.  IN  PTR

;; ANSWER SECTION:
***.***.***.***.in-addr.arpa. 21599 IN  PTR ***.***.***.***.dy.bbexcite.jp.

;; Query time: 275 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Feb  6 04:34:11 2015
;; MSG SIZE  rcvd: 90

vcap@18e7k2hjjf4:~$ curl http://portquiz.net:8080/
Port 8080 test successful!
Your IP: ***.***.***.***
vcap@18e7k2hjjf4:~/app$ dig -x ***.***.***.***

; <<>> DiG 9.7.0-P1 <<>> -x ***.***.***.***
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47165
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;***.***.***.***.in-addr.arpa.  IN  PTR

;; Query time: 4 msec
;; SERVER: 10.0.0.177#53(10.0.0.177)
;; WHEN: Fri Feb  6 04:26:23 2015
;; MSG SIZE  rcvd: 46

(IP addresses are masked manually)
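
The forward and reverse checks above could also be scripted so that both containers run exactly the same test. The following is a minimal sketch, assuming Python is available inside the container. The function names `forward_lookup` and `reverse_lookup` are hypothetical helpers, not part of the buildpack or of tmate, and the script uses the system resolver through Python's `socket` module, which may not be the same resolver path that tmate uses.

```python
import socket


def forward_lookup(host, port=22, resolver=socket.getaddrinfo):
    """Resolve host to addresses.

    Returns (True, sorted list of addresses) on success, or
    (False, (error code, message)) if the resolver raises gaierror.
    The resolver argument exists only so the function can be tested
    without network access (hypothetical design choice).
    """
    try:
        infos = resolver(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        # args[0] is the EAI_* code, args[1] the human-readable message.
        return False, (exc.args[0], exc.args[1])
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return True, sorted({info[4][0] for info in infos})


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the hostname that ip maps back to, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except (socket.herror, socket.gaierror):
        return None
    return hostname
```

Calling `forward_lookup("master.tmate.io")` and `reverse_lookup` with the address printed by portquiz.net in each container would give directly comparable results. A failing forward lookup reports an `EAI_*` code (for example `socket.EAI_NONAME`), which may help match the result against the messages in tmux-server-45.log.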

Any thoughts? Thanks in advance.

nota-ja commented 9 years ago

So I have managed to modify launch in cloudfoundry-tmate-buildpack to produce verbose logs and modify compile script to use given binary of launch if available.

Here it is: https://github.com/nota-ja/cloudfoundry-tmate-buildpack/tree/my-master

Sorry for forgetting to include this in the first post.