kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

Use Docker to host most server components #19

Closed jbeda closed 8 years ago

jbeda commented 10 years ago

Right now we use salt to distribute and start most of the server components for the master. As we support build and deployment from a local Mac, we don't pre-compile the go scripts but instead ship the source code to the nodes (with salt) and compile at install time.

Instead, we should do the following:

  • Only support building on a Linux machine with Docker installed. Perhaps support local development on a Mac with a local Linux VM.
  • Package each server component up as a Docker image, built with a Dockerfile.
    • We should support uploading these Docker images to either the public index or a GCS-backed index with google/docker-registry https://index.docker.io/u/google/docker-registry/.
    • Use the kubelet to run/health-check the components. This means the kubelet will manage a set of static tasks on each machine (including the master) and a set of dynamic tasks.
  • The only task that shouldn't run under Docker is the kubelet itself. We may have to hack in something for (network mode = host) for the proxy.

proppy commented 10 years ago

On Sat, Jun 7, 2014 at 10:35 PM, Joe Beda notifications@github.com wrote:

Right now we use salt to distribute and start most of the server components for the master. As we support build and deployment from a local Mac, we don't pre-compile the go scripts but instead ship the source code to the nodes (with salt) and compile at install time.

Instead, we should do the following:

  • Only support building on a linux machine with docker installed. Perhaps support local development on a mac with a local linux VM

We could link to docker instruction about boot2docker: http://docs.docker.io/installation/mac/

  • Package each server component up as a Docker image, built with a Dockerfile
    • We should support uploading these Docker images to either the public index or a GCS backed index with google/docker-registry https://index.docker.io/u/google/docker-registry/.
    • Use the kubelet to run/health check the components. This means the kubelet will manage a set of static tasks on each machine (including the master) and a set of dynamic tasks.
  • The only task that shouldn't run under the docker should be the kubelet itself. We may have to hack in something for (network mode = host) for the proxy.

— Reply to this email directly or view it on GitHub https://github.com/GoogleCloudPlatform/kubernetes-new/issues/19.
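The "package each server component as a Docker image" bullet above might look roughly like this (a minimal sketch; the base image, binary name, and paths are illustrative, not from the issue):

```dockerfile
# Hypothetical sketch: one image per server component, with a precompiled
# binary copied in and launched as the entrypoint.
FROM debian:wheezy
ADD _output/apiserver /usr/local/bin/apiserver
ENTRYPOINT ["/usr/local/bin/apiserver"]
```

The kubelet would then run such images as static tasks on each machine.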

Johan Euphrosine (proppy) Developer Programs Engineer Google Developer Relations

monnand commented 10 years ago

I have some questions, just out of curiosity:

Thank you!

jbeda commented 10 years ago

@monnand I imagine that we will continue to use salt to bootstrap stuff. But we'll be able to reduce some of the more complex salt config.

For example, we currently ship and compile the source everywhere where it is run. If we start building docker images, we can precompile the binaries before they are run.

I'm thinking that we'll follow the example of Docker itself and do the build process in docker containers. This, with boot2docker, could lead to a good dev flow for Mac OS X.

monnand commented 10 years ago

@jbeda Thank you!

You also mentioned that the kubelet should not run under Docker. Is that for a technical reason, or something else? I do not see any technical difficulty in running the kubelet in Docker. Or did I miss something?

jbeda commented 10 years ago

We may be able to run the kubelet under docker, but most likely we'll want it to have a whole machine view and expanded privs. Running it under a cgroup container is totally doable. namespaces? I'm not so sure if we can make that happen.

Another way of looking at this is that I think of the kubelet as operating at the same level as Docker itself (and perhaps merging with Docker at some point?) and so it should run outside of Docker.

monnand commented 10 years ago

@jbeda Correct me if I'm wrong. (I'm not saying that kubelet should run inside a docker container. I'm just trying to see what are the technical difficulties here.)

As far as I know, the kubelet only needs to communicate with Docker through the Docker remote API, which is either through a unix socket or a remote IP/port pair. Does it need to read/write the cgroup filesystem? In either case, it seems that we could mount /var/run onto the container and run the kubelet inside that container.

We are currently doing this in cadvisor, which runs inside a docker container but can communicate with docker daemon and read information from the cgroup filesystem. The container could still run inside its own namespace, but communicate with docker daemon through the mounted volume. We use the following command to run cadvisor inside a docker container:

sudo docker run \
  --volume=/var/run:/var/run:rw \
  --volume=/sys/fs/cgroup/:/sys/fs/cgroup:ro \
  --volume=/var/lib/docker/:/var/lib/docker:ro \
  --publish=8080:8080 \
  --detach=true \
  google/cadvisor
jbeda commented 10 years ago

@monnand Nice! We should try and make that work.

One thing I worry about is things like driving iptables rules. To solve #15 we'll have to be able to either muck with iptables rules or get a new networking mode into Docker proper.

brendanburns commented 10 years ago

There was just a discussion of this at the Plumbers meeting. The Union Bay Networks folks want the ability to muck with the network from a container too.

Brendan

vmarmol commented 10 years ago

Today you should be able to get the host's network. +1 to @brendanburns's comment.

proppy commented 10 years ago

Yes, --net host should do the trick.

Another interesting thing to do is -v /var/run/docker.sock:/var/run/docker.sock to access the docker daemon from the container. (or just having the docker daemon listen on localhost w/ --net host)
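Combining both suggestions, running an agent like the kubelet in a container could be sketched as follows (the image name is hypothetical; this assumes the daemon listens on the default unix socket):

```shell
# Hypothetical sketch: give the container the host's network stack and
# access to the Docker daemon through the mounted unix socket.
docker run \
  --net=host \
  --volume=/var/run/docker.sock:/var/run/docker.sock \
  --detach=true \
  example/agent
```

With --net=host the container could also drive host networking (e.g. iptables, given the right privileges), which relates to the concern above.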

jbeda commented 10 years ago

Notes of work in progress:

  • I'm starting out by moving our build process into Docker. A snapshot of a Dockerfile and a Makefile to automate some common stuff is here: jbeda@6c4a6a8 https://github.com/jbeda/kubernetes/commit/6c4a6a8fa7794874862cadaf31948bdb9235f51a
  • If we say everyone has to build on Linux, this is much easier. With boot2docker on a Mac, you essentially have a remote machine that you are talking to through a local TCP pipe. That means with any -v local-path:container-path the local-path is really local to the boot2docker VM and not the local workstation.

This leaves us with 2 choices:

  1. Build in the boot2docker VM and copy results back out, either through stdout from the docker run or via boot2docker ssh
  2. Build the final docker images inside the boot2docker VM inside of a docker container. Yup, this means running docker in docker. This is supported with dind but it gets complicated. There is a repo (https://github.com/jpetazzo/dind) with support but...

Right now I'm leaning toward copying stuff in and out (option 1).
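Option 1 could be sketched roughly like this (image names and paths are illustrative only, not from the repo):

```shell
# Hypothetical sketch of option 1: compile inside a container running in the
# boot2docker VM, then copy the resulting binary back out of the container.
docker build -t kube-build .                       # image with toolchain + sources
docker run --name kube-build-run kube-build        # ENTRYPOINT runs the build
docker cp kube-build-run:/out/kubelet ./_output/   # copy the artifact back out
docker rm kube-build-run                           # clean up the build container
```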

proppy commented 10 years ago

You could have a Dockerfile for each individual binary and have the resulting container image launch it as its ENTRYPOINT.

That way you could leverage the fact that sources are sent from your workstation to the VM as the /build payload (context) over the remote API (no need to copy to the host), and the docker image is your artefact. On Jun 14, 2014 8:48 AM, "Joe Beda" notifications@github.com wrote:

Notes of work in progress:

  • I'm starting out by moving our build process into Docker. A snapshot of a Dockerfile and a Makefile to automate some common stuff is here: jbeda@6c4a6a8 https://github.com/jbeda/kubernetes/commit/6c4a6a8fa7794874862cadaf31948bdb9235f51a
  • If we say everyone has to build on Linux, this is much easier. With the boot2docker on a Mac, you essentially have a remote machine that you are talking to through a local TCP pipe. That means with any -v local-path:container-path the local-path is really local to the boot2docker VM and not the local workstation.

This leaves us with 2 choices:

  1. Build in the boot2docker VM and copy results back out, either through stdout from the docker run or via boot2docker ssh
  2. Build the final docker images inside the boot2docker VM inside of a docker container. Yup, this means running docker in docker. This is supported with dind but it gets complicated. There is a repo ( https://github.com/jpetazzo/dind) with support but...

Right now I'm leaning toward copying stuff in and out (option 1).

— Reply to this email directly or view it on GitHub https://github.com/GoogleCloudPlatform/kubernetes/issues/19#issuecomment-46091452 .

proppy commented 10 years ago

Note: if you just want to have a container to build the project and get binaries out.

You can also set the ENTRYPOINT to the build command.

docker build ; docker run  # to build
docker cp                  # to get the file out of the container

proppy commented 10 years ago

You could also have a combination of the two.

Build and run kube on top of google/golang for development; for production, if the size of the google/debian base bothers you, rebase the binary on top of the busybox image.
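The rebase-on-busybox idea could be sketched as follows (a minimal hypothetical Dockerfile; the binary name and path are assumptions, and it presumes the binary is statically linked so it needs nothing from the base image):

```dockerfile
# Hypothetical production image: only the statically linked binary on top of
# busybox, instead of shipping the much larger build image.
FROM busybox
ADD _output/apiserver /apiserver
ENTRYPOINT ["/apiserver"]
```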

jbeda commented 10 years ago

Thanks for the comments @proppy.

I want the resultant container image to be minimal. I like the idea of layering it on the busybox image.

That means that the image used to build should be different than the image used at runtime. Doing docker cp to copy things around is one thing I'm looking to avoid. dind is one solution there. Rebasing will require either dind or docker cp. If we don't do this carefully we end up packaging up 60+MB every time we build the image. That takes too long :)

proppy commented 10 years ago

FYI, I have a pending patch to Docker that could provide a hacky alternative: https://github.com/dotcloud/docker/pull/5715

This would allow something like

docker build -t builder ; (docker run builder | docker build -t runner -)
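Expanded slightly, that two-stage flow might look like this (a sketch, not a working recipe: it assumes the builder image's ENTRYPOINT writes a tar-formatted build context, containing the compiled binary plus a runtime Dockerfile, to stdout, which docker build can read with the `-` context):

```shell
# Hypothetical sketch: stage 1 compiles; stage 2 builds a minimal runtime
# image from the tar stream that the builder container emits on stdout.
docker build -t builder .
docker run builder | docker build -t runner -
```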


saad-ali commented 9 years ago

CC: @saad-ali

bgrant0607 commented 9 years ago

See also #5011.

errordeveloper commented 8 years ago

Isn't this mostly done already?

bgrant0607 commented 8 years ago

Yes.

maicohjf commented 5 years ago


15. Check to see how many nodes are ready (not including nodes tainted NoSchedule) and write the number to /opt/......

First use kubectl get node to check the total number of nodes, then kubectl describe node to count the nodes that do not have a NoSchedule taint.

  1. From the Pod label name=cpu-utilizer, find pods running high CPU workloads and write the name of the Pod consuming most CPU to the file /opt/...... (which already exists)

Method 1: use kubectl get pod -l name=cpu-utilizer -o wide to find the node each pod runs on, then kubectl describe node to see which pod requests the most CPU. All three were 0, so I wrote them all in. Method 2: kubectl top pod -l name=cpu-utilizer