clearcontainers / proxy

Hypervisor based containers proxy
Apache License 2.0

vendor: Vendor virtcontainers #196

Closed amshinde closed 6 years ago

amshinde commented 6 years ago

vendor: Vendor virtcontainers

This pulls in changes related to the --no-new-privileges flag and Linux capabilities.

Virtcontainers shortlog:
fb1eecd mount: Fix unmount of dangling bind-mounts
d7462c7 pkg/oci: Clarify resource calculation comment
027aab8 qemu: adjust QMP naming to avoid non-unique truncation
0c4064e capabilities: Pass capabilities to hyperstart.
e20ba9d oci: Add support for capabilities
6776dd9 shim: Correct kata debug flag
b307c08 qemu: refactor/simplify addDevice function
747d364 vhost-user: rewrite to use interfaces/embedded types
cc67fb0 vhost-user: enabling for vhost-user network devices
f5587cf device: make a more generic function for hypervisor args
d6f0600 gitignore: Add new shim binary to gitignore list
bf8359f gitignore: Add new shim binary to gitignore list
c30fd9a ci: Install missing dep tool
d1bb792 kata_agent: Signal the kata shim
08c96c2 shim: Generalize stopShim
3e86f7b vendor: Force kata containers agent vendoring
34952bb shim: Factorize the shim config structure between kata and CC
a7e244a shim: Factorize shim execution code
eb8befb shim: Add a Kata shim mock implementation
6da9685 shim: Add a Kata Containers shim type
18f46de kata_shim: Initial implementation
60a446a container: Generate process token when not set
4c2c9a4 mounts: Fix bug while checking if /dev was bind-mounted
67fcb6d pkg/oci: honour CPU period and quota
1a3de59 agent: Add kata exec, stopContainer and killContainer
4f92997 annotations: Update tests to use package prefix
b3da3de mount: Fix tests for bindMountContainerRootfs function
aa75a0e kata_agent: Implement vmURL and setProxyURL
53f093d kata_agent: Initial VSOCK support
bc302d2 kata_agent: Implement the validate function
cb7fac2 kata_agent: Rename shared dir paths
0b95eda kata_agent: Create and start container implementations
c4a4be4 mount: Gather the entire bind mount API
be96f34 vendor: Update for Kata Containers
c0692ca annotations: Move OCI annotations to the annotation package
1a638c0 ci: Handle complex revendoring cases
fe726af kata_agent: Implement pod start and stop ops
3d1afe4 kata_agent: Initialize gRPC client
f6948d8 agent: Add new agent type for Kata Containers
7d35db6 kata_agent: Add Kata agent configuration
b2fe3df hyperstart: Rename to hyperstart_agent
f86cd11 kata_agent: Initial implementation
414e156 qemu: improve kernel boot time
bdde7bb pod: Add important comment for Cmd type
a217958 oci: Pass the NoNewPrivileges flag to the agent.
f683602 oci: Add support for NoNewPrivileges in oci spec
ddd89b7 cheanup: Remove vendored "golang.org/x/crypto" package.
fd6e357 cleanup: Remove sshd agent
b7c19b8 cni: update function names for consistency
847eaaa network: scan network one less time
dc4836a network: unique ID is not unique
8449f56 network: refactoring and cleanup of CNI path
d6f9690 vendor: Revendor govmm for VSOCK support

Fixes #195

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>

clearcontainersbot commented 6 years ago

kubernetes qa-passed 👍

jodh-intel commented 6 years ago

I take it this is just a general update (or are there particular commits we need)?

lgtm

Approved with PullApprove

jodh-intel commented 6 years ago

Ah - I see it's for NoNewPrivileges support. It would be useful to record that in the commit itself ideally.

clearcontainersbot commented 6 years ago

kubernetes qa-failed 👎

amshinde commented 6 years ago

@jodh-intel I have added an additional description to the commit message. However, I am seeing the unit tests for the proxy fail, and I am not able to reproduce that locally.

clearcontainersbot commented 6 years ago

kubernetes qa-failed 👎

jcvenegas commented 6 years ago

@chavafg any idea why k8s failed?

chavafg commented 6 years ago

@jcvenegas @amshinde, the Kubernetes pods cannot be created successfully. There are hung cc-runtime processes and errors in the proxy log, but no errors in the runtime log:

Hung processes:

fuentess@proxy196:~/go/src/github.com/clearcontainers/tests/integration/kubernetes$ ps -ef | grep cc-
root      26248      1  0 22:20 ?        00:00:06 /usr/bin/dockerd -D --add-runtime cc-runtime=/usr/local/bin/cc-runtime --default-runtime=cc-runtime --storage-driver=overlay2
root      80111      1  0 23:21 ?        00:00:00 /usr/libexec/clear-containers/cc-proxy -uri unix:///run/virtcontainers/pods/26298cfb32401af12d4c829025d21ce95543898dec29a708c8c5cb19537aa607/proxy.sock -log debug
root      80773  68729  0 23:21 ?        00:00:00 /usr/local/bin/cc-runtime start e883c6a30dc488d68831b6fe297ce9880cd6505aacfb1322de782831a187ce46
root      80782  68729  0 23:21 ?        00:00:00 /usr/local/bin/cc-runtime state 26298cfb32401af12d4c829025d21ce95543898dec29a708c8c5cb19537aa607
fuentess  83365   2233  0 23:37 pts/1    00:00:00 grep --color=auto cc-

From cc-proxy journal:

Jan 24 23:21:20 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:20.80033088Z" level=error msg="error serving client: read unix /run/virtcontainers/pods/26298cfb32401af12d4c829025d21ce95543898dec29a708c8c5cb19537aa607/proxy.sock->@: use of closed network connection" client=9 name=cc-proxy pid=80111 source=proxy
Jan 24 23:21:20 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:20.800441781Z" level=info msg="connection closed" client=9 name=cc-proxy pid=80111 source=proxy
Jan 24 23:21:20 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:20.800362381Z" level=error msg="error serving client: read unix /run/virtcontainers/pods/26298cfb32401af12d4c829025d21ce95543898dec29a708c8c5cb19537aa607/proxy.sock->@: use of closed network connection" client=5 name=cc-proxy pid=80111 source=proxy
Jan 24 23:21:20 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:20.800522181Z" level=info msg="connection closed" client=5 name=cc-proxy pid=80111 source=proxy
Jan 24 23:21:20 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:20.900618035Z" level=info msg="client connected" client=15 name=cc-proxy pid=80111 source=proxy
Jan 24 23:21:20 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:20.900702336Z" level=info msg="client connected" client=16 name=cc-proxy pid=80111 source=proxy
Jan 24 23:21:20 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:20.902003142Z" level=info msg="client connected" client=17 name=cc-proxy pid=80111 source=proxy
Jan 24 23:21:20 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:20.902982246Z" level=info msg="connection closed" client=15 name=cc-proxy pid=80111 source=proxy
Jan 24 23:21:20 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:20.903077846Z" level=info msg="connection closed" client=16 name=cc-proxy pid=80111 source=proxy
Jan 24 23:21:20 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:20.903153747Z" level=info msg="connection closed" client=17 name=cc-proxy pid=80111 source=proxy
Jan 24 23:21:50 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:50.212217603Z" level=error msg="timeout waiting for process with token VPKhiwMcOqnbMsd5QYcHSl193pWUhmvf2thR0q4RMdM=" name=cc-proxy pid=80111 section=io source=proxy vm=26298cfb32401af12d4c829025d21ce95543898dec29a708c8c5cb19537aa607
Jan 24 23:21:50 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:50.212784906Z" level=debug msg="lost the shim for %sVPKhiwMcOqnbMsd5QYcHSl193pWUhmvf2thR0q4RMdM=" name=cc-proxy pid=80111 section=io source=proxy vm=26298cfb32401af12d4c829025d21ce95543898dec29a708c8c5cb19537aa607
Jan 24 23:21:50 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:50.213218808Z" level=error msg="error serving client: timeout waiting for process with token VPKhiwMcOqnbMsd5QYcHSl193pWUhmvf2thR0q4RMdM=" client=13 name=cc-proxy pid=80111 source=proxy
Jan 24 23:21:50 proxy196 cc-proxy[80111]: time="2018-01-24T23:21:50.213285408Z" level=info msg="connection closed" client=13 name=cc-proxy pid=80111 source=proxy

amshinde commented 6 years ago

@chavafg Looks like the process within the container failed to start, likely because the newcontainer command failed. Do you see any errors from the agent of the type "ERROR received from VM agent"?

devimc commented 6 years ago

@amshinde https://github.com/containers/virtcontainers/pull/581 should be merged first, and then you have to include it together with https://github.com/clearcontainers/proxy/pull/208 in this PR

devimc commented 6 years ago

... and you will need https://github.com/clearcontainers/agent/pull/202 in the agent

amshinde commented 6 years ago

@devimc I don't understand why that change is required. I have added my comments in the issue. Is there something different going on with k8s?

devimc commented 6 years ago

@amshinde take a look at the code: cc-proxy sends/writes a pointer to Process and cc-agent receives/reads a Process, and k8s sends a lot of data, so that's probably the reason

amshinde commented 6 years ago

@devimc cc-proxy should marshal the pointers correctly to JSON. It really should not matter whether the proxy uses pointers or objects as long as the data is marshalled correctly. Please correct me if I am missing something. I suspect this may be due to the data going over 4096 bytes.

devimc commented 6 years ago

@amshinde if that is the case, then just apply clearcontainers/agent#202 and this patch, if that works in k8s then you're right and containers/virtcontainers#581 and #208 are not needed

amshinde commented 6 years ago

@chavafg Can you try the k8s tests with the latest agent code that has @devimc's changes: https://github.com/clearcontainers/agent/pull/202

amshinde commented 6 years ago

@chavafg @devimc Let me know if that works for you. I really want to get this PR merged so that users can add capabilities. It has also been sitting around because the proxy CI was breaking due to the Go version.

amshinde commented 6 years ago

cc @egernst

sameo commented 6 years ago

LGTM @chavafg Could you please run the k8s tests?

Approved with PullApprove

chavafg commented 6 years ago

@amshinde I tested with latest agent code (which includes clearcontainers/agent#202) and I got these errors:


Jan 29 15:00:03 proxy196 cc-proxy[16258]: time="2018-01-29T15:00:03.917007275Z" level=debug msg="{\\\"level\\\":\\\"info\\\",\\\"msg\\\":\\\"startpod_end\\\",\\\"name\\\":\\\"cc-agent\\\",\\\"pid\\\":160,\\\"time\\\":\\\"2018-01-29T15:00:03.866537632Z\\\"}" name=cc-proxy pid=16258 source=qemu vm=929359b9a7c01bd0cb33118675ae1ce3ec17ce73892f5eb4db79991e09bcb320
Jan 29 15:00:03 proxy196 cc-proxy[16258]: time="2018-01-29T15:00:03.917455671Z" level=debug msg="{\\\"channel\\\":\\\"ctl\\\",\\\"command\\\":\\\"startpod\\\",\\\"error\\\":\\\"Could not setup the network: Could not setup network routes: Could not add route dest(10.244.0.0/16)/src()/gw(10.244.0.1)/dev(eth0): network is unreachable\\\",\\\"level\\\":\\\"info\\\",\\\"msg\\\":\\\"command failed\\\",\\\"name\\\":\\\"cc-agent\\\",\\\"pid\\\":160,\\\"time\\\":\\\"2018-01-29T15:00:03.866992628Z\\\"}" name=cc-proxy pid=16258 source=qemu vm=929359b9a7c01bd0cb33118675ae1ce3ec17ce73892f5eb4db79991e09bcb320
Jan 29 15:00:03 proxy196 cc-proxy[16258]: time="2018-01-29T15:00:03.917798869Z" level=info msg="connection closed" client=2 name=cc-proxy pid=16258 source=proxy
Jan 29 15:00:05 proxy196 cc-proxy[15869]: time="2018-01-29T15:00:05.394258694Z" level=debug msg="[   16.672444] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x2113955381f, max_idle_ns: 440795222875 ns" name=cc-proxy pid=15869 source=qemu vm=4f4bd142cab89925ed054635ab190ede6b022e50814271a192f402264d5770aa
Jan 29 15:00:05 proxy196 cc-proxy[9453]: time="2018-01-29T15:00:05.659133466Z" level=debug msg="[  203.853444] systemd-journald[120]: Sent WATCHDOG=1 notification." name=cc-proxy pid=9453 source=qemu vm=dea9996678d46ba9800595acc608423ad6f3fbe3e909bc34648047dfee2b85ed
Jan 29 15:00:05 proxy196 cc-proxy[9293]: time="2018-01-29T15:00:05.731927008Z" level=debug msg="[  207.949856] systemd-journald[125]: Sent WATCHDOG=1 notification." name=cc-proxy pid=9293 source=qemu vm=f9e5b314f280d4bd3acba116b9c6f73ff3b54daec47461fba1e735bbf47c651d

which I think is https://github.com/clearcontainers/agent/issues/182

egernst commented 6 years ago

@chavafg -- just catching up on the issue. Seems that's not a new error. Does that occur on the baseline (ie, whatever is on master for proxy) right now as well? If so, I'd like to see this get merged.

chavafg commented 6 years ago

@egernst the results above only happen with this PR and the latest code from clearcontainers/agent, as @amshinde suggested.

chavafg commented 6 years ago

Seems like we are carrying clearcontainers/agent#182 in the latest agent, which I think should be solved first so that the agent can be used as the base for a new image.