docker run -it doesn't work. container will stuck in starting mode.

malikkal commented 7 years ago

docker run without -d doesn't work. Engine version: v0.8.0-7315-c8ac999

Steps to reproduce.

Deploy a VCH with separate management and public networks, with or without additional container networks.
Run a container from the registry with -it; in this case, a Red Hat 6.8 base image.
The command fails with a generic error: Error response from daemon: Server error from portlayer: unable to wait for process launch status
The container is stuck at 'Starting' when issuing docker ps -a.

However, the container will run without issues when deployed via Admiral or with -itd.

In summary, no interactive session with the container is possible.

Also refer to #4212, #3315 and #4223 (similar issues).

VCH routes for reference.

malikkal commented 7 years ago

This is resolved. Probably a bug?

Enable mode 2 for rp_filter in the VCH host and make it permanent:

echo 2 > /proc/sys/net/ipv4/conf/default/rp_filter
sysctl -w "net.ipv4.conf.all.rp_filter=2"

For details refer:

http://www.slashroot.in/linux-kernel-rpfilter-settings-reverse-path-filtering
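
For reference, a minimal Python sketch of how the resulting setting could be inspected. It relies on the documented kernel behaviour that the larger of conf/all/rp_filter and conf/<interface>/rp_filter is applied when validating sources on an interface. The function names are illustrative and not part of VIC.

```python
from pathlib import Path

# Meaning of the rp_filter values as documented for the Linux kernel.
RP_FILTER_MODES = {0: "off", 1: "strict", 2: "loose"}

# Standard sysctl location; passed as a parameter so it can be overridden.
CONF_ROOT = "/proc/sys/net/ipv4/conf"


def effective_rp_filter(all_value: int, iface_value: int) -> int:
    """Return the mode the kernel applies: the max of conf/all and conf/<iface>."""
    return max(all_value, iface_value)


def read_rp_filter(name: str, root: str = CONF_ROOT) -> int:
    """Read the rp_filter value for 'all', 'default' or an interface name."""
    return int(Path(root, name, "rp_filter").read_text().strip())


def report(root: str = CONF_ROOT) -> dict:
    """Map each interface to the name of its effective rp_filter mode."""
    all_value = read_rp_filter("all", root)
    result = {}
    for entry in sorted(Path(root).iterdir()):
        if entry.name in ("all", "default"):
            continue
        mode = effective_rp_filter(all_value, read_rp_filter(entry.name, root))
        result[entry.name] = RP_FILTER_MODES.get(mode, "unknown")
    return result
```

Calling report() on the endpointVM would list each interface with "off", "strict" or "loose". Note that conf/default only seeds interfaces created afterwards, whereas conf/all takes part in the max for every existing interface.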

hickeng commented 7 years ago

@malikkal given the symptoms I assume this is related to setup of the network serial port connection between ESX host and the endpointVM (protocol:tcp src:esxhost dest:endpoint.management.net:2377). Does your comment mean that:

the esxhost is not routable from the endpointVM's management interface (I presume eth0 as we do not rename the management interface), or
the underlying routing from esx to endpointVM means the packets are coming in on a different interface (I would guess client in this case)

Altering the reverse packet filter opens up the possibility of spoofing, with someone pretending to be an ESX host creating a backchannel. While this should be handled at multiple levels (#2849) it's a weakening of the default configuration. I'd like to understand the specific issue you're seeing as it's unexpected if there's a separate management network.

If it's the first option I'm hoping this can be addressed by adding more permissive routes on the management interface, with rp_filter=2 being a fallback approach.
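
To make the two cases concrete, here is a simplified Python model of the reverse path check. It is a sketch only, not the kernel's implementation: the routing table, interface names and addresses are hypothetical, and a route lookup is reduced to longest-prefix match.

```python
import ipaddress

# Hypothetical endpointVM routing table: (destination network, interface).
ROUTES = [
    (ipaddress.ip_network("192.168.10.0/24"), "mgmt"),
    (ipaddress.ip_network("192.168.20.0/24"), "client"),
    (ipaddress.ip_network("0.0.0.0/0"), "public"),  # default route
]


def best_route(routes, address):
    """Return the (network, interface) with the longest matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    matches = [(net, iface) for net, iface in routes if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)


def rp_filter_accepts(routes, src, in_iface, mode):
    """Model rp_filter: 0 = off, 1 = strict, 2 = loose."""
    if mode == 0:
        return True
    route = best_route(routes, src)
    if route is None:
        return False  # source unreachable via any interface
    if mode == 2:
        return True   # loose: any route back to the source is enough
    return route[1] == in_iface  # strict: reply must leave via the arrival interface


# An ESX host (hypothetical address) whose packets arrive on the client interface.
ESX_HOST = "172.16.5.10"
print(rp_filter_accepts(ROUTES, ESX_HOST, "client", mode=1))
print(rp_filter_accepts(ROUTES, ESX_HOST, "client", mode=2))
```

In this model, strict mode drops the ESX packet because the best route back to the host leaves via the public interface, while loose mode accepts it. Adding a route for the ESX subnet only helps strict mode if the packets also arrive on the interface that route uses.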

hickeng commented 7 years ago

For implementing a global rp_filter change - update: https://github.com/vmware/vic/blob/master/isos/appliance/nat-setup#L30

To address more permissive routes we would need to:
a. require that the management network config have a netmask that's broad enough to encompass the ESX hosts as well as the vSphere endpoint, or
b. inspect the hosts in the cluster and add specific routes, either for a combined subnet or for the hosts themselves.
c. determine if the hosts are routable on the management interface and, if not, set rp_filter=2, hoping that the ESX-originated traffic has a route to the endpoint at all.
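
As a rough illustration of options (a) and (b), the following Python sketch finds which hosts fall outside a management subnet and computes the smallest single subnet covering a set of hosts. The addresses and function names are hypothetical; this is not how vic-machine inspects a cluster.

```python
import ipaddress


def hosts_outside(management_net, hosts):
    """Return the hosts that the management network's netmask does not cover."""
    net = ipaddress.ip_network(management_net)
    return [h for h in hosts if ipaddress.ip_address(h) not in net]


def combined_subnet(hosts):
    """Return the smallest single network that contains every host address."""
    if not hosts:
        raise ValueError("at least one host address is required")
    addrs = [ipaddress.ip_address(h) for h in hosts]
    # Start from a single-address network and widen it one bit at a time.
    net = ipaddress.ip_network(f"{addrs[0]}/{addrs[0].max_prefixlen}")
    while not all(a in net for a in addrs):
        net = net.supernet()
    return net


# Hypothetical ESX host addresses and management network.
esx_hosts = ["10.0.0.12", "10.1.0.5", "10.1.0.9"]
unrouted = hosts_outside("10.0.0.0/24", esx_hosts)
print(unrouted)                   # hosts needing an explicit route
print(combined_subnet(unrouted))  # one candidate destination covering them
```

A combined subnet keeps the route list short but may cover addresses that are not ESX hosts; per-host routes are narrower at the cost of more entries.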

@corrieb @kreamyx this is another vote for the communication to be injected via vmci and routed at the infrastructure layer where the hosts are assumed to have connectivity for vmotion and friends to function.

This is also a reminder to re-introduce the post-install functional correctness checks:

pull an image - verify registry config
interactive container on each host - management network routing and ESX firewall
n:m container ping tests - checks the DPG function of the bridge network

hickeng commented 7 years ago

@malikkal If you're using vSAN, would you mind trying docker logs to stream logs from a running container? With the container live we need to go direct to the owning host in order to access the output.log file instead of relaying through vCenter. This would also be impacted by an inability to route to the host on the management network, meaning this would only work if the public network (default route) has a route back to the hosts.

malikkal commented 7 years ago

@hickeng Yes, it's related to the setup of the network serial port connection between the ESX host and the endpointVM. Many thanks for the explanation during the Webex, which prompted me to look further. Appreciate it.

In our case all the subnets used for the VCH are routable. Also, the ESXi management network is routable.
Yes, the packets were coming in via the client interface.

Sorry, we don't use vSAN yet. BTW, can I redirect selected logs to loginsight?

hickeng commented 7 years ago

@malikkal My pleasure :) Wasn't sure if you were the same person, but seemed likely given the scenario. If you add additional routing targets via:

--management-network-gateway  Gateway for the VCH on the management network, including one or more routing destinations in a comma separated list, e.g. 10.1.0.0/16,10.2.0.0/16:10.0.0.1

does that remove the need for the rp_filter change?
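
To show how that value format breaks down, here is a small Python sketch that splits it into routing destinations and gateway and checks whether a given host is covered. The parsing is an assumption based only on the example string above (IPv4, destinations before the colon, gateway after it); it is not vic-machine's own parser, and the host addresses are hypothetical.

```python
import ipaddress


def parse_gateway_spec(spec):
    """Split 'dest1,dest2:gateway' (or a bare 'gateway') into (destinations, gateway)."""
    destinations_part, sep, gateway_part = spec.rpartition(":")
    gateway = ipaddress.ip_address(gateway_part.strip())
    destinations = []
    if sep:
        destinations = [
            ipaddress.ip_network(item.strip())
            for item in destinations_part.split(",")
            if item.strip()
        ]
    return destinations, gateway


def covered(destinations, host):
    """Return True if any routing destination contains the host address."""
    addr = ipaddress.ip_address(host)
    return any(addr in net for net in destinations)


destinations, gateway = parse_gateway_spec("10.1.0.0/16,10.2.0.0/16:10.0.0.1")
print(destinations, gateway)
print(covered(destinations, "10.2.3.4"))    # hypothetical ESX host inside a destination
print(covered(destinations, "172.16.5.10")) # hypothetical ESX host with no matching route
```

A host that is not covered by any destination would still be reached via the default route, which is the situation where the reverse path check can fail.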

We don't currently support direct loginsight integration for container logs. Given the logs are currently being persisted on the datastore in the containerVM folder our available approaches at this time are:

allow customization of the serial port backing so a remote destination can be specified (i.e. the log aggregator) - this would require the aggregator be routable from the ESX hosts as with the interactive connection.
provide a mechanism to harvest the logs from the datastore (preferably without round-tripping to the endpointVM but could do if needed)
attach all containers needing log aggregation to a specific container network configured to only allow access to a remote syslogd endpoint and use guest networking in the containerVMs to do the forwarding (problematic if running with --net=none).
forward logs via endpointVM NAT - problematic for all the same reasons as having other data paths routed through there.

Basically there are ways to do it that are trivial to add, but ways to do it without adding complexity or burden to container usage involve work - I'm biased against anything that mandates a specific network requirement for the containerVMs. I do not know enough about operational requirements to know if a scheduled batch collection is feasible or whether live log data is required so I err towards thinking about the latter.

mhagen-vmware commented 7 years ago

@hickeng Can you estimate and prioritize this for me since you have been working on this? And I guess rename this issue to capture the desire to support direct loginsight integration?

hickeng commented 7 years ago

@mhagen-vmware I'm leaving this issue open for the correct configuration of rp_filter and management network routing, as that was the original problem. I've split out #4771 to record the loginsight request.

We could do with a response from @malikkal about whether adding additional routing destinations for the hosts via --management-network-gateway addresses the problem (as per this comment). However, the need for a way to weaken the filtering for asymmetric routing setups remains regardless.

Sizing will be for an option to allow weakening the packet filtering (vic-machine and doc primarily). This should not be a change to the core ISO configuration but a per-deployment choice. I've added this to the 1.2 project as we should get this into that release for @malikkal

malikkal commented 7 years ago

Pardon me for the delay here; juggling priorities. I will test this and update here by Tue 18th. Thank you for all the support.

hickeng commented 7 years ago

@stuclem #4816 adds an option --asymmetric-routes to deal with the case where we have genuinely asymmetric routes and adding destinations to the --management-network-gateway option will not suffice.

The result of this being true is to set rp_filter to "loose" mode in the endpointVM; see https://en.wikipedia.org/wiki/Reverse_path_forwarding#Loose_mode

Likely symptoms needing this option are:

containers run without -d remain in starting state
connections can be made via port forwarding to running containers from the public network, but not from client or management (assuming that's desired). @hmahmood should check me here.

In more detail, when starting a container without -d, we will never see the log entries for incoming connections from the newly started container in the portlayer log. The line of interest is the first one in the following snippet, which logs vSPC handling a new connection:

Mar 20 2017 04:11:59.010Z INFO  connection received
Mar 20 2017 04:11:59.010Z INFO  sending WILL 0
Mar 20 2017 04:11:59.010Z DEBUG entered write loop
Mar 20 2017 04:11:59.025Z DEBUG Sending command: 251 0
Mar 20 2017 04:11:59.025Z INFO  sending WILL 3
Mar 20 2017 04:11:59.025Z DEBUG Sending command: 251 3
Mar 20 2017 04:11:59.025Z INFO  sending WILL 1
Mar 20 2017 04:11:59.025Z DEBUG Sending command: 251 1
Mar 20 2017 04:11:59.025Z INFO  sending DO 0
Mar 20 2017 04:11:59.025Z DEBUG Sending command: 253 0
Mar 20 2017 04:11:59.025Z INFO  sending DO 3
Mar 20 2017 04:11:59.026Z DEBUG Sending command: 253 3
Mar 20 2017 04:11:59.026Z INFO  sending DO 232
Mar 20 2017 04:11:59.026Z DEBUG Sending command: 253 232
Mar 20 2017 04:11:59.026Z DEBUG Sending command: 254 37
Mar 20 2017 04:11:59.026Z DEBUG Sending WILL command
Mar 20 2017 04:11:59.026Z DEBUG Sending command: 251 0
Mar 20 2017 04:11:59.026Z DEBUG Sending WILL command
Mar 20 2017 04:11:59.026Z DEBUG Sending command: 251 3
Mar 20 2017 04:11:59.026Z DEBUG Sending WILL command
Mar 20 2017 04:11:59.026Z DEBUG Sending command: 251 1
Mar 20 2017 04:11:59.026Z DEBUG Sending command: 253 0
Mar 20 2017 04:11:59.027Z DEBUG Sending command: 253 3
Mar 20 2017 04:11:59.027Z DEBUG Sending command: 253 232
Mar 20 2017 04:11:59.027Z INFO  vspc received KNOWN-SUBOPTIONS command
Mar 20 2017 04:11:59.027Z DEBUG [BEGIN] [github.com/vmware/vic/lib/vspc.(*handler).knownSuboptions:114] handling KNOWN-SUBOPTIONS
Mar 20 2017 04:11:59.027Z DEBUG response to KNOWN-SUBOPTIONS: [255 250 232 1 0 1 2 3 40 41 43 44 45 46 48 70 71 73 80 81 84 85 86 87 82 83 255 240]
Mar 20 2017 04:11:59.027Z DEBUG [ END ] [github.com/vmware/vic/lib/vspc.(*handler).knownSuboptions:114] [592.003µs] handling KNOWN-SUBOPTIONS
Mar 20 2017 04:11:59.027Z INFO  vspc received DO-PROXY command
Mar 20 2017 04:11:59.027Z DEBUG [BEGIN] [github.com/vmware/vic/lib/vspc.(*handler).doProxy:131] handling DO-PROXY
Mar 20 2017 04:11:59.027Z DEBUG response to DO-PROXY: [255 250 232 71 255 240]
Mar 20 2017 04:11:59.028Z DEBUG [ END ] [github.com/vmware/vic/lib/vspc.(*handler).doProxy:131] [42.942µs] handling DO-PROXY
Mar 20 2017 04:11:59.028Z INFO  vspc received VMUUID command
Mar 20 2017 04:11:59.028Z DEBUG [BEGIN] [github.com/vmware/vic/lib/vspc.(*handler).cVMUUID:84] handling VMUUID
Mar 20 2017 04:11:59.028Z INFO  vmuuid of the connected containerVM: 5281ea49d163c912-226e06e2ed4763e2
Mar 20 2017 04:11:59.028Z INFO  attempting to connect to the attach server
Mar 20 2017 04:11:59.028Z DEBUG [ END ] [github.com/vmware/vic/lib/vspc.(*handler).cVMUUID:84] [114.95µs] handling VMUUID
Mar 20 2017 04:11:59.028Z INFO  attach connector: Received incoming connection
Mar 20 2017 04:11:59.038Z DEBUG HandshakeClient: Sending syn.
Mar 20 2017 04:11:59.038Z DEBUG Sending command: 254 44
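
As a quick way to check for that line, here is a minimal Python sketch that scans portlayer log lines for the vSPC "connection received" entry. The timestamp layout is inferred from the snippet above, and the function names are illustrative; the log file location is left to the caller.

```python
import re

# Layout inferred from the snippet: "Mar 20 2017 04:11:59.010Z INFO  message".
LOG_LINE = re.compile(
    r"^(?P<ts>\w{3} \d{1,2} \d{4} \d{2}:\d{2}:\d{2}\.\d{3}Z) "
    r"(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)


def vspc_connections(lines):
    """Yield the timestamp of every line that logs a new vSPC connection."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("msg") == "connection received":
            yield match.group("ts")


def saw_vspc_connection(lines):
    """Return True if at least one incoming vSPC connection was logged."""
    return any(True for _ in vspc_connections(lines))


sample = [
    "Mar 20 2017 04:11:59.010Z INFO  connection received",
    "Mar 20 2017 04:11:59.010Z INFO  sending WILL 0",
    "Mar 20 2017 04:11:59.028Z INFO  attach connector: Received incoming connection",
]
print(list(vspc_connections(sample)))
```

If no such line appears after a docker run without -d, the connection from the ESX host never reached the endpointVM, which points at routing or reverse path filtering rather than at the container itself.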

Users should first check that the management-network-gateway has route entries for the subnets containing both the target vCenter and the corresponding ESX hosts, assuming the ESX hosts are accessible via the management network (this is the more secure deployment option). If not, then asymmetric routing is required to permit incoming connections from the hosts via one of the other endpointVM networks.

mhagen-vmware commented 7 years ago

lgtm

mhagen-vmware commented 7 years ago

closing now, please reopen if you are still having issues

vmware / vic

docker run -it doesn't work. container will stuck in starting mode. #4613