drone / drone-runtime

[DEPRECATED] migrated to https://github.com/drone-runners

kubernetes: allow skipping namespace creation #53

Closed MOZGIII closed 5 years ago

MOZGIII commented 5 years ago

I'd like the Kubernetes Drone deployment to use a single namespace for all of its workloads.

For me this makes more sense than giving Drone full cluster-admin control.

Frankly, I expected Drone to work like this when I was setting it up, and I ended up with the following RBAC config:

apiVersion: v1
kind: Namespace
metadata:
  name: drone-workloads
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: drone-ci-runner
  namespace: drone-workloads
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch", "extensions"]
  resources: ["jobs"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: drone-ci-runner-binding
  namespace: drone-workloads
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: drone-ci-runner
subjects:
  - kind: ServiceAccount
    name: default
    namespace: drone

Here the Drone server lives in the drone namespace and runs its workloads in the drone-workloads namespace.

For some reason, Drone attempted to create a namespace, and the build failed with the following:

default: namespaces is forbidden: User "system:serviceaccount:drone-workloads:default" cannot create resource "namespaces" in API group "" at the cluster scope

Actually, as I think about it now, it might be trying to create the drone-workloads namespace itself, or something along those lines; I'm not sure. In other words, I'm stuck, please help.

If it is trying to create a namespace, it would be a good idea to add a configuration value to skip that step, for cases like mine where I don't want to give Drone cluster admin.
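
(For reference, the permission that error is asking for can only be granted cluster-wide. Roughly, with an illustrative role name, it would look like this:)

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: drone-namespace-admin   # illustrative name
rules:
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["create", "delete"]   # namespaces are cluster-scoped, so a
                                # namespaced Role can never grant this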

bradrydzewski commented 5 years ago

a single namespace is problematic because it can result in service-name collisions when using service containers. For example, the service container redis is accessible to a pipeline as tcp://redis. This would not be possible in a shared namespace without the risk of hostname collisions and/or exposing resources to the wrong pipelines.
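
For concreteness, a minimal pipeline of the kind in question (a sketch, not taken from any particular repo). Two builds running this in one shared namespace would both try to claim the redis name:

kind: pipeline
name: default

services:
- name: redis           # reachable from steps as tcp://redis:6379
  image: redis:5

steps:
- name: ping
  image: redis:5
  commands:
  - redis-cli -h redis ping   # resolves by the bare service name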

MOZGIII commented 5 years ago

I see, but multiple namespaces are problematic too, because of the level of privileges the operation requires. Plus, it seems much better if all the Drone workloads are isolated in a single namespace rather than scattered across multiple namespaces. That said, I'm a rookie at this k8s stuff, and I might not have enough experience to figure this one out.

MOZGIII commented 5 years ago

Seems like we need some way to define namespaces within namespaces to make this work perfectly...

MOZGIII commented 5 years ago

Technically, you can still go across namespaces if you want to, btw. You'd use tcp://redis.namespace, for example tcp://redis.production. That is, unless Network Policies are in place.
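
(In Kubernetes DNS terms, redis.production is just shorthand for the fully qualified service name:)

redis.production.svc.cluster.local   # <service>.<namespace>.svc.<cluster-domain>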

bradrydzewski commented 5 years ago

there are no plans to change the default behavior; however, @metalmatze has also expressed interest in adding a flag to use a static namespace. He and I briefly discussed the issue of service-name collisions, and this is something he is considering. If a workable solution for services in a single namespace can be found, we will be able to add the flag.

MOZGIII commented 5 years ago

I'd like to take a look if that discussion is in the open somewhere; could you post a link here?

metalmatze commented 5 years ago

I have not started working on it. I looked at the Kubernetes parts of this runtime, and it shouldn't be hard to guard the namespacing behind flags. However, I'm quite busy at the moment and can't promise anything. Drone is running in my cluster but is currently unusable due to the same problems described above, and I would really like to finally use it. So expect the change to come from my side if nobody does it earlier.

MOZGIII commented 5 years ago

I was thinking about the issue with namespaces and conflicting service DNS lately. My understanding is that even though it would be possible for Drone to run everything in a single namespace, it wouldn't be optimal, and there would be edge-case issues we'd be better off solving from the get-go. This means I'm for separate namespaces; however, I'd like to restrict Drone's service account very carefully so it can't do naughty stuff to the parts of the cluster it shouldn't touch.

As a recap, here is my thinking on the design of a single-namespace feature.

Given all that, and the fact that the default of multiple namespaces is there to stay, I think we have to solve the problem of the overly broad permissions Drone requires in that mode either way. What I think we need is to make Drone use only namespaces with a certain prefix: give it the ability to create and delete namespaces under the specified prefix, and to manage certain resources (i.e. jobs, endpoints, etc.) within them. I'm not sure, though, whether that can be done with current RBAC, or what resources Drone needs access to.
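
For illustration, the closest stock RBAC gets is pinning rules to exact object names with resourceNames; prefixes and wildcards aren't supported, so the prefix idea would need something beyond plain RBAC. A sketch, with illustrative names:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: drone-known-namespaces   # illustrative name
rules:
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["get", "delete"]
  resourceNames: ["drone-workloads"]   # exact names only, no prefixes
# note: "create" cannot be limited by resourceNames at all, because the
# object name is not known at authorization time, so namespace creation
# stays all-or-nothing at the cluster scope.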

Overall, I came to the conclusion that k8s support can be "basic" - matching the basic semantics of running the pipeline and services - or it can be "extended", in the sense that Drone could allow users to create additional cluster resources as part of the pipeline - for example some CRDs, if they need them for whatever reason. I think even service spawning (i.e. the handling of a services directive in .drone.yml) could be made modular and implemented as a plugin.

I love how Drone is extensible all the way - maybe we could actually use a plugin system for service management too, @bradrydzewski?

schmitch commented 5 years ago

btw, I'm currently evaluating a migration from GitLab to Drone + Gitea, and this is a big issue for us. gitlab-runner works by exposing services through environment variables, so you can easily connect to whatever Postgres is running for job XYZ. We will never grant cluster-admin rights to Drone just so that a service can have a name like 'postgres'.

schmitch commented 5 years ago

btw, it's probably easier/better for Drone to use https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-config - i.e. have a built-in DNS server in the Drone runtime that actually creates postgres.ci-1234.drone.local.

i.e. Drone sets the built-in Drone DNS plus the k8s DNS servers. Or just use environment variables.
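
A sketch of what that would look like with the dnsConfig pod field from the linked page (the nameserver address and search domain are made up):

apiVersion: v1
kind: Pod
metadata:
  name: ci-step
spec:
  dnsPolicy: "None"            # replace the cluster DNS settings entirely
  dnsConfig:
    nameservers:
    - 10.0.0.53                # hypothetical built-in Drone DNS server; it
                               # would have to forward other names to kube-dns
    searches:
    - ci-1234.drone.local      # hypothetical per-build search domain
  containers:
  - name: step
    image: ubuntu
    command: ["sleep", "3600"]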

MOZGIII commented 5 years ago

Can we create an endpoint/service/pod with a dot in its name? If so, that should be good enough, combined with custom searches. If that's not possible, we might want to implement some support for creating custom in-cluster DNS records at CoreDNS.

MOZGIII commented 5 years ago

Just found out about the hostname and subdomain pod options: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-hostname-and-subdomain-fields

MOZGIII commented 5 years ago

Ok, so: if we use the .drone.yml service name (for example postgres) as the pod hostname and the Drone build (drone-123, for example) as the subdomain, and add drone-123.my-namespace.svc.cluster.local to the pod's searches list, we'll effectively solve the naming issue without using separate namespaces.

This way, when the pod resolves postgres, it will actually resolve postgres.drone-123.my-namespace.svc.cluster.local - and that is exactly the name of the pod with hostname postgres and subdomain drone-123. See the sketch below.
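
An untested sketch of that setup (my-namespace and the port are placeholders). Note that Kubernetes only publishes the hostname.subdomain record when a headless Service named after the subdomain selects the pod:

apiVersion: v1
kind: Service
metadata:
  name: drone-123              # must match the pods' subdomain
  namespace: my-namespace
spec:
  clusterIP: None              # headless: gives pods per-host DNS records
  selector:
    drone-build: drone-123
  ports:
  - port: 5432                 # any port; present only to satisfy the spec
---
apiVersion: v1
kind: Pod
metadata:
  name: drone-123-postgres
  namespace: my-namespace
  labels:
    drone-build: drone-123
spec:
  hostname: postgres           # the .drone.yml service name
  subdomain: drone-123         # FQDN becomes
                               # postgres.drone-123.my-namespace.svc.cluster.local
  containers:
  - name: postgres
    image: postgres:11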

I can't test this atm; it would be great if somebody could confirm that this setup works in practice.

bradrydzewski commented 5 years ago

I am increasingly convinced that we are going down the wrong path here. The native Kubernetes runtime was a very valuable experiment and we have learned quite a bit, but I think we need to adjust. Drone and the Drone yaml were designed for Docker, and it feels like we are trying to abstract and force-fit Drone for Docker into Drone for Kubernetes; I feel like it is just going to leave everyone unsatisfied.

I believe we should instead create a separate yaml definition and runtime for Kubernetes (similar to what we are planning for exec pipelines [1], aws pipelines [2], vmware fusion pipelines [3], etc). This could even be a simple wrapper around Knative or Tekton ...

kind: pipeline
type: knative

steps:
- name: echo
  image: ubuntu
  command:
  - echo
  args:
  - "hello world"

The yaml file was specifically designed to allow us to define different yaml structures based on the pipeline type. This allows us to create a yaml structure that is specific to the target system, instead of trying to force-fit everything into a one-size-fits-all structure:

kind: pipeline
type: docker|knative|tekton|exec|chroot|qemu

[1] https://github.com/drone/drone-runtime/issues/46
[2] https://github.com/drone/drone-runtime/issues/49
[3] https://github.com/drone/drone-runtime/issues/15

iron-sam commented 5 years ago

We are worried about the same issue of overly broad permissions for pipeline jobs.

In the meantime, is there any restricted RBAC policy we can create that allows pipelines to create their own namespaces without binding the cluster-admin role?

Thanks!

jeanlucmongrain commented 5 years ago

I don't mind dynamically created namespaces, but I really wish we had a KUBERNETES_NAMESPACE environment variable for troubleshooting, instead of running kubectl get namespaces and trying to find which namespace a job belongs to.
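
The hunt being described looks something like this today (there are no guaranteed labels to filter on; the namespace name is a placeholder):

kubectl get pods --all-namespaces -o wide       # eyeball which namespace holds the build
kubectl -n <generated-namespace> describe pods  # then dig into the suspect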

arianitu commented 5 years ago

Does anybody have the most restricted Role/ClusterRole that works with Drone? We're in a similar position, worried about giving cluster-admin to Drone.

We may have to create an entirely separate cluster to run Drone so we don't have a potential security hole coming from the CI.

arianitu commented 5 years ago

I've been using the ClusterRole below so far; if anyone has suggestions for further restrictions, that would be good:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: drone
rules:
- apiGroups:
  - batch
  - extensions
  resources:
  - jobs
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
- apiGroups:
  - apps
  - extensions
  resources:
  - deployments
  verbs:
  - get
  - list
  - watch
  - patch
  - update
- apiGroups:
  - ""
  resources:
  - namespaces
  - configmaps
  - secrets
  - pods
  - services
  verbs:
  - create
  - delete
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - pods/log
  verbs:
  - get

jeanlucmongrain commented 5 years ago

Look at the roles in the Helm chart, like that one.

MOZGIII commented 5 years ago

You guys know these are still ClusterRoles - in both the listing above and the link (the Helm chart) - right? Not really much more secure than just giving Drone cluster admin... The logical way to step up the security policy significantly would be to allow Drone to act on only certain namespaces - or just make it run everything in a single namespace - but the latter is not supported, and per-namespace restrictions can only be implemented with some extra piece of software, something like a custom admission controller, etc.

bradrydzewski commented 5 years ago

the kubernetes implementation in this repository was scrapped for reasons described here. We have a new implementation, created from scratch, that no longer creates per-pipeline namespaces. It can be found here: https://github.com/drone-runners/drone-runner-kube

bradrydzewski commented 5 years ago

first draft of new kubernetes runner documentation can be found here: https://kube-runner.docs.drone.io/