peterlamar opened this issue 9 years ago
Did you mean CloudProvider (OpenStack) integration?
Sure, updated
You are referring to providing a public IP address to the kubernetes services themselves?
Indeed, let me know if you have other ideas
Maybe we should try to adopt OpenStack Magnum. It's going to become a native approach in OpenStack to provide containers to cloud users. Here is a video from the latest summit: https://www.openstack.org/summit/vancouver-2015/summit-videos/presentation/magnum-containers-as-a-service-for-openstack
If the skydns add-on is enabled then we would have to give skydns a real network block that is addressable. Without skydns, then flannel (or the networking layer for docker used) would need a real network block that is addressable.
Services as defined by Kubernetes are good for creating a "virtual name" for a pod, but even using LoadBalancer mode for the service you still get a random port assignment. As such, I had to leverage the proxy-to-service approach (https://github.com/GoogleCloudPlatform/kubernetes/tree/v1.0.1/contrib/for-demos/proxy-to-service) to get a constant port that I could then hand to the OpenStack LoadBalancer, so that I ended up with a public IP address.
We have several engineers at Cisco developing Magnum whom we should sync with; this was actually suggested to me Friday.
Solving this will likely guide us to the networking solution we would like to use. We used Calico for MI, but let's be open to others if there is a good reason.
Sounds good!
With OpenStack we will use Neutron in any case. But we can choose plugins for Neutron such as OVS, Calico, OpenDaylight, etc.
This keeps coming back up. Are there any creative solutions that do not integrate with OpenStack? We can keep our OpenStack integration efforts going, but they will take a while regardless.
@ldejager suggested providing a sort of dynamic (DNS) registration service ourselves. For example, user X spins up our k8s solution; upon completion, terraform.py posts the information that would usually go into /etc/hosts to the registration service and gets back (and prints out) a unique resolvable DNS name for the instance, i.e. k8s-master.X.cs.co, that points to the IP address of the master, which they can then reach.
I'm not sure, but this sounds like a list of goals:
1. DNS / service discovery, which may be satisfied via the k8s DNS add-on, plus adding the DNS resolver for the k8s cluster into the client-side application.
2. Automatically exposing ports on an external host so there is a known endpoint address.
3. Address resolution via network block management.
(1) can be tested with the implementations already available. For (2), exposing ports: assuming that (3), address resolution via network block management, is working via either a bridge (flat CIDR space) or a flannel implementation, the problem of exposing ports transparently is the same either way.
Currently there's an effort to add transparent proxying (use iptables for proxying instead of userspace #3760), and there's a contrib option to enable a bare-metal load balancing solution without a provider-specific integration, the recently moved service-loadbalancer. That approach claims it will support cross-cluster load balancing.
Let's track the iptables solution, and test the load balancing solution in our environment.
An alternative implementation could be to write our own monitor that injects port forwarding for services: we keep the default load balancing a normal service provides, but forward the port from the public machine to the internal k8s service's ip:port. Either etcd or kube event discovery could be monitored to manage the creation and teardown of the port forwards, and the default round-robin load balancing of the k8s service would still apply.
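A minimal sketch of the reconcile step such a monitor could use. The types and names here are hypothetical; a real monitor would read the desired state from etcd or kube watch events and spawn or kill actual forwarder processes rather than just returning the diff:

```go
package main

import "fmt"

// Forwarder represents a running port forward from a public host port
// to an internal k8s service ip:port (hypothetical type for this sketch).
type Forwarder struct {
	HostPort int
	Backend  string // service ip:port
}

// Reconcile compares the desired set of service endpoints (e.g. from etcd
// or kube watch events) against the currently running forwarders, returning
// which backends to start and which host ports to stop.
func Reconcile(desired map[int]string, running map[int]Forwarder) (start map[int]string, stop []int) {
	start = map[int]string{}
	for port, backend := range desired {
		if f, ok := running[port]; !ok || f.Backend != backend {
			start[port] = backend
		}
	}
	for port := range running {
		if _, ok := desired[port]; !ok {
			stop = append(stop, port)
		}
	}
	return start, stop
}

func main() {
	running := map[int]Forwarder{
		8888: {8888, "172.24.254.88:8888"},
		3000: {3000, "172.24.254.30:3000"},
	}
	desired := map[int]string{
		8888: "172.24.254.88:8888", // unchanged, keep running
		9090: "172.24.254.90:9090", // new service, start a forwarder
	}
	start, stop := Reconcile(desired, running)
	fmt.Println(start, stop)
}
```

Driving this loop from watch events keeps the forwarders converging on cluster state without any per-service manual setup.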
3) A flat address block with flannel traffic is generally NATed, but we can invert the bridge to flatten the visibility of the containers. Flannel might look like this:
For this example, assume flannel manages a private subnet of 172.24.0.0/16, and the k8s nodes are:
node0: 10.1.12.10
node1: 10.1.12.11
node2: 10.1.12.12
etcd runs on node0.
cloud-init configuration for etcd2:
#cloud-config
coreos:
  etcd2:
    advertise-client-urls: http://10.1.12.10:2379
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    listen-peer-urls: http://10.1.12.10:2380
    proxy: off
  units:
    - name: etcd2.service
      command: restart
      enable: true
flannel drop-in for cloud-init [systemd]:
      drop-ins:
        - name: 50-network-config.conf
          content: |
            [Service]
            ExecStartPre=-/usr/bin/etcdctl --peers=10.1.12.10:2379 set /coreos.com/network/config '{ "Network": "172.24.0.0/16" }'
            [Install]
            WantedBy=multi-user.target
$ for node in $(etcdctl ls --recursive /coreos.com/network); do echo ${node} $(etcdctl get ${node} 2>/dev/null); done
/coreos.com/network/config { "Network": "172.24.0.0/16" }
/coreos.com/network/subnets
/coreos.com/network/subnets/172.24.5.0-24 {"PublicIP":"10.1.12.11"}
/coreos.com/network/subnets/172.24.92.0-24 {"PublicIP":"10.1.12.10"}
/coreos.com/network/subnets/172.24.95.0-24 {"PublicIP":"10.1.12.12"}
These are directly routable via the host routing table:
$ ip r
default via 10.1.12.1 dev eth0 proto dhcp src 10.1.12.10 metric 1024
10.1.12.0/24 dev eth0 proto kernel scope link src 10.1.12.10
10.1.12.1 dev eth0 proto dhcp scope link src 10.1.12.10 metric 1024
172.24.0.0/16 dev flannel0 proto kernel scope link src 172.24.92.0
172.24.92.0/24 dev docker0 proto kernel scope link src 172.24.92.1
So flannel provides container-to-container and host-to-container connectivity via IP.
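As a small illustration of how those etcd entries can be used, this sketch (a hypothetical helper, not part of flannel) resolves which node hosts a given container IP by matching it against the /coreos.com/network/subnets keys shown above:

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// nodeFor returns the node PublicIP whose flannel subnet contains ip.
// subnets maps flannel etcd keys like "172.24.5.0-24" to node public IPs,
// mirroring the /coreos.com/network/subnets entries listed above.
func nodeFor(ip string, subnets map[string]string) (string, bool) {
	addr := net.ParseIP(ip)
	if addr == nil {
		return "", false
	}
	for key, node := range subnets {
		// flannel encodes "172.24.5.0/24" as "172.24.5.0-24" in the key
		_, cidr, err := net.ParseCIDR(strings.Replace(key, "-", "/", 1))
		if err != nil {
			continue
		}
		if cidr.Contains(addr) {
			return node, true
		}
	}
	return "", false
}

func main() {
	subnets := map[string]string{
		"172.24.5.0-24":  "10.1.12.11",
		"172.24.92.0-24": "10.1.12.10",
		"172.24.95.0-24": "10.1.12.12",
	}
	node, _ := nodeFor("172.24.92.7", subnets) // container on node0's subnet
	fmt.Println(node)
}
```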
Bridge without overlay fabric might look like this:
The bridge can be a member of a CIDR block shared by all cluster members; for now let's say a /16 address space. This CIDR block is transparent, without NAT, to all other cluster members, including all of the members' containers.
bridge0
|
bond0
/ \
eth0 eth1
In cloud-init format, using 10.10.0.0/16 with a gateway of 10.10.0.1, this host assigned 10.10.0.2/24, and the docker bridge configured with 10.10.2.0/24, the configuration might look like the following:
#cloud-config
coreos:
  units:
    - name: 10.static.netdev
      command: start
      content: |
        [NetDev]
        Name=bridge0
        Kind=bridge
    - name: 20.static.network
      command: start
      content: |
        [Match]
        Name=bridge0
        [Network]
        Address=10.10.0.2/16
        DNS=...
        DNS=...
        Gateway=...
        IPForward=yes
    - name: 50.static.network
      command: start
      content: |
        [Match]
        Name=eth0
        [Network]
        Bridge=bridge0
    - name: 60.static.network
      command: start
      content: |
        [Match]
        Name=eth1
        [Network]
        Bond=bond0
Then configure docker to use this same subnet:
    - name: docker.service
      command: start
      enable: true
      content: |
        [Unit]
        After=docker.socket
        Description=Docker Application Container Engine
        Documentation=http://docs.docker.io
        [Service]
        Restart=always
        Environment="DOCKER_OPT_BIP=-b=bridge0"
        Environment="DOCKER_OPT_MTU="
        Environment="DOCKER_OPT_CIDR=--fixed-cidr=10.10.2.1/24"
        Environment="DOCKER_OPTS=--host=unix:///var/run/docker.sock"
        ExecStart=/bin/bash -c "/usr/lib/coreos/dockerd \
          --daemon \
          ${DOCKER_OPTS} \
          ${DOCKER_OPT_BIP} \
          ${DOCKER_OPT_CIDR} \
          ${DOCKER_OPT_MTU} \
          ${DOCKER_OPT_IPMASQ} \
          "
        [Install]
        WantedBy=multi-user.target
A sample implementation without auto-discovery, pinning the service address to the fixed port .88:8888, using a trivial forwarder and the flannel network overlay:
default sfs-svc k8s-app=sfs-svc k8s-app=sfs-rc 172.24.254.88 8888/TCP
With:
kubectl scale --replicas=1 rc/sfs-rc
for (( i=0; i<4; i++ )); do printf "%3d %s\n" "${i}" "$(curl --silent 10.1.12.10:8888|grep host)"; done
0 <a href="host-coreos-alpha-00">host-coreos-alpha-00</a>
1 <a href="host-coreos-alpha-00">host-coreos-alpha-00</a>
2 <a href="host-coreos-alpha-00">host-coreos-alpha-00</a>
3 <a href="host-coreos-alpha-00">host-coreos-alpha-00</a>
With:
kubectl scale --replicas=3 rc/sfs-rc
for (( i=0; i<3; i++ )); do printf "%3d %s\n" "${i}" "$(curl --silent 10.1.12.10:8888|grep host)"; done
0 <a href="host-coreos-alpha-03">host-coreos-alpha-03</a>
1 <a href="host-coreos-alpha-02">host-coreos-alpha-02</a>
2 <a href="host-coreos-alpha-01">host-coreos-alpha-01</a>
And from a public ip with a running forwarder:
for (( i=0; i<3; i++ )); do printf "%3d %s\n" "${i}" "$(curl --silent 208.90.61.54:8888|grep host)"; done
0 <a href="host-coreos-alpha-03">host-coreos-alpha-03</a>
1 <a href="host-coreos-alpha-02">host-coreos-alpha-02</a>
2 <a href="host-coreos-alpha-01">host-coreos-alpha-01</a>
The systemd service file
# /etc/systemd/system/forward-sfs.service
[Unit]
Description=%N port forward to k8s service on fixed flannel subnet address 172.24.254.88:8888
After=flanneld.service
Requires=flanneld.service
[Service]
# 172.24.254.88 is route-able from the flannel members
# /coreos.com/network/config { "Network": "172.24.0.0/16" }
ExecStart=/var/lib/ecmi/forward 10.1.12.10:8888 172.24.254.88:8888
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
The cluster state after the rescale looks like this:
NAME LABELS STATUS
coreos-alpha-00 kubernetes.io/hostname=coreos-alpha-00 Ready
coreos-alpha-01 kubernetes.io/hostname=coreos-alpha-01 Ready
coreos-alpha-02 kubernetes.io/hostname=coreos-alpha-02 Ready
coreos-alpha-03 kubernetes.io/hostname=coreos-alpha-03 Ready
NAMESPACE CONTROLLER CONTAINER(S) IMAGE(S) SELECTOR REPLICAS
default sfs-rc sfs simple-file-server k8s-app=sfs-rc,version=v1 3
kube-system kube-ui-v1 kube-ui gcr.io/google_containers/kube-ui:v1.1 k8s-app=kube-ui,version=v1 1
NAMESPACE NAME READY STATUS RESTARTS AGE NODE
default sfs-rc-n4dvj 1/1 Running 0 5m coreos-alpha-03
default sfs-rc-o155f 1/1 Running 0 6m coreos-alpha-01
default sfs-rc-yvmz1 1/1 Running 0 5m coreos-alpha-02
kube-system kube-ui-v1-mlv2r 1/1 Running 0 8m coreos-alpha-00
NAMESPACE NAME LABELS SELECTOR IP(S) PORT(S)
default kubernetes component=apiserver,provider=kubernetes <none> 172.24.254.1 443/TCP
default sfs-svc k8s-app=sfs-svc k8s-app=sfs-rc 172.24.254.88 8888/TCP
kube-system kube-ui k8s-app=kube-ui,kubernetes.io/cluster-service=true,kubernetes.io/name=KubeUI k8s-app=kube-ui 172.24.254.19 80/TCP
NAMESPACE NAME ENDPOINTS
default kubernetes 10.1.12.10:6443
default sfs-svc 172.24.40.2:8888,172.24.49.5:8888,172.24.93.2:8888
kube-system kube-ui 172.24.53.5:8080
And a single static endpoint was used in the service:
apiVersion: v1
kind: Service
metadata:
  name: sfs-svc
  labels:
    k8s-app: sfs-svc
spec:
  selector:
    k8s-app: sfs-rc
  ports:
    - port: 8888
  clusterIP: 172.24.254.88
sfs is a simple file server; in the directory it serves, each host has a file whose name is the hostname prefixed with "host-", e.g. host-coreos-alpha-01.
Edit: After clearing the cluster and restarting with DNS, the cluster config appears to be working with full DNS discovery:
core@coreos-alpha-00 ~ $ kubectl exec -it busybox -- cat /etc/resolv.conf
nameserver 172.24.254.53
nameserver 10.1.12.1
search default.svc.k8s.local svc.k8s.local k8s.local novalocal
core@coreos-alpha-00 ~ $ kubectl exec -it busybox -- nslookup sfs-svc
Server: 172.24.254.53
Address 1: 172.24.254.53
Name: sfs-svc
Address 1: 172.24.254.88
[Screenshot: Kubernetes UI showing a partial list of guestbook cluster members]
A simplified replication controller using the internal load balancing from kube-proxy and simple port forwarding:
apiVersion: v1
kind: ReplicationController
metadata:
  name: service-sfs-loadbalancer
  labels:
    app: service-sfs-loadbalancer
    version: v1
spec:
  replicas: 1
  selector:
    app: service-sfs-loadbalancer
    version: v1
  template:
    metadata:
      labels:
        app: service-sfs-loadbalancer
        version: v1
    spec:
      nodeSelector:
        role: master
      containers:
        - image: simple-forwarder:latest
          imagePullPolicy: IfNotPresent
          name: simple-forwarder
          ports:
            - containerPort: 8888
              hostPort: 8888
          resources: {}
          privileged: true
          args:
            - "/simple-forwarder"
            - "0.0.0.0:8888"
            - "172.24.254.88:8888"
simple-forwarder build script
#!/bin/bash
version=0.1
dir=$(dirname $(readlink -f ${0}))
cd ${dir}
cat > Dockerfile <<EOF
FROM centos:latest
COPY simple-forwarder /simple-forwarder
CMD [ "/simple-forwarder" ]
EOF
if docker build --force-rm --rm --tag=simple-forwarder . ; then
  docker tag simple-forwarder:latest simple-forwarder:${version}
fi
simple-forwarder.go proof of concept
// simple-forwarder.go
package main

import (
	"io"
	"log"
	"net"
	"os"
)

// forward dials the backend and splices bytes in both directions.
func forward(connection net.Conn) {
	client, err := net.Dial("tcp", os.Args[2])
	if err != nil {
		// A failed dial should drop this connection, not kill the forwarder.
		log.Printf("Connection failed: %v", err)
		connection.Close()
		return
	}
	log.Printf("Connected to localhost %v %v\n", connection.LocalAddr(), connection.RemoteAddr())
	go func() {
		defer client.Close()
		defer connection.Close()
		io.Copy(client, connection)
	}()
	go func() {
		defer client.Close()
		defer connection.Close()
		io.Copy(connection, client)
	}()
}

func main() {
	if len(os.Args) != 3 {
		log.Fatalf("Usage: %s frontend-ip:port backend-ip:port\n", os.Args[0])
	}
	listener, err := net.Listen("tcp", os.Args[1])
	if err != nil {
		log.Fatalf("net.Listen(\"tcp\", %s) failed: %v", os.Args[1], err)
	}
	for {
		connection, err := listener.Accept()
		if err != nil {
			log.Fatalf("ERROR: failed to accept listener: %v", err)
		}
		log.Printf("Accepted connection %v %v\n", connection.LocalAddr(), connection.RemoteAddr())
		go forward(connection)
	}
}
guestbook front end
The prior example with sfs uses a fixed, hard-coded IP address from the service pool to map the public address (which is assigned to the master node in this example) to the backend guestbook.
The mapping of an ip:port pair is dynamically managed via environment variables. This introduces a creation-sequence dependency in k8s: the environment variables GUESTBOOK_PORT_3000_TCP_ADDR and GUESTBOOK_SERVICE_PORT aren't available until after the service object is created.
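For reference, the injected variables follow the docker-links naming convention (service name upcased, dashes replaced with underscores). A small sketch (hypothetical helper, not a k8s API) of how the names used below are derived:

```go
package main

import (
	"fmt"
	"strings"
)

// serviceEnvNames returns the docker-links-style variable names that
// kubernetes injects for a service, e.g. GUESTBOOK_PORT_3000_TCP_ADDR.
// Note: pods only see variables for services that existed when the pod
// started, which is the sequence dependency described above.
func serviceEnvNames(service string, port int) []string {
	prefix := strings.ToUpper(strings.ReplaceAll(service, "-", "_"))
	return []string{
		fmt.Sprintf("%s_SERVICE_HOST", prefix),
		fmt.Sprintf("%s_SERVICE_PORT", prefix),
		fmt.Sprintf("%s_PORT_%d_TCP_ADDR", prefix, port),
		fmt.Sprintf("%s_PORT_%d_TCP_PORT", prefix, port),
	}
}

func main() {
	for _, name := range serviceEnvNames("guestbook", 3000) {
		fmt.Println(name)
	}
}
```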
kubectl create -f guestbook-fe.yaml
# guestbook-fe.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: guestbook-fe
  labels:
    app: guestbook-fe
    version: v1
spec:
  replicas: 1
  selector:
    app: guestbook-fe
    version: v1
  template:
    metadata:
      labels:
        app: guestbook-fe
        version: v1
    spec:
      nodeSelector:
        role: master
      containers:
        - image: simple-forwarder:latest
          imagePullPolicy: IfNotPresent
          name: simple-forwarder
          ports:
            - containerPort: 3000
              hostPort: 3000
          resources: {}
          privileged: true
          args:
            - "/bin/bash"
            - "-c"
            - "/simple-forwarder 0.0.0.0:3000 ${GUESTBOOK_PORT_3000_TCP_ADDR}:${GUESTBOOK_SERVICE_PORT}"
kubectl exec busybox -- nslookup guestbook.default
Server: 172.24.254.53
Address 1: 172.24.254.53
Name: guestbook.default
Address 1: 172.24.254.30
Alternatively, with kube-dns, the sequence dependency can be removed from the process: the args section could be replaced by the DNS reference to the service.
args:
  - "/simple-forwarder"
  - "0.0.0.0:3000"
  - "guestbook:3000"
Alex pointed out the proxy-to-service method from Google in the kubernetes repo.
Judging from the log, the gcr container uses socat:
kubectl logs guestbook-fe-pxy-svc-037q1
Running socat TCP-LISTEN:3000,reuseaddr,fork TCP:guestbook.default:3000
Kubernetes example reverse proxy for DNS
The corresponding YAML for the guestbook replication controller, assuming again that the node with role=master holds the public IP address, might look like the following. Notice that this depends on kube-dns (or similar functionality) being active on the cluster, because it references the guestbook.default service DNS name.
# based on
# https://github.com/kubernetes/contrib/tree/master/for-demos/proxy-to-service
apiVersion: v1
kind: ReplicationController
metadata:
  name: guestbook-fe-pxy-svc
  labels:
    app: guestbook-fe-pxy-svc
    version: v1
spec:
  replicas: 1
  selector:
    app: guestbook-fe-pxy-svc
    version: v1
  template:
    metadata:
      labels:
        app: guestbook-fe-pxy-svc
        version: v1
    spec:
      nodeSelector:
        role: master
      containers:
        - name: guestbook-fe-pxy-svc-tcp
          image: gcr.io/google_containers/proxy-to-service:v2
          imagePullPolicy: IfNotPresent
          args: [ "tcp", "3000", "guestbook.default" ]
          ports:
            - name: tcp
              protocol: TCP
              containerPort: 3000
              hostPort: 3000
As a tenant, I can assign an IP automatically to services on CloudProvider (OpenStack) via cmd line so that services can easily be made external
Currently tenants must modify their /etc/hosts file or do other hacky workarounds to reach the guestbook example in Kubernetes when running outside of Google App Engine. It would be great to automate this and create a better user experience.