flux-framework / flux-k8s

Project to manage Flux tasks needed to standardize kubernetes HPC scheduling interfaces
Apache License 2.0

First pass to re-org the repo #28

Closed: ArangoGutierrez closed this 1 year ago

ArangoGutierrez commented 2 years ago

This patch works as follows:

ArangoGutierrez commented 2 years ago

Test image: quay.io/eduardoarango/kubeflux:sidecar

ArangoGutierrez commented 2 years ago

Tested locally with a basic job

I0311 15:32:56.562959       1 round_trippers.go:460] Response Headers:
I0311 15:32:56.562969       1 round_trippers.go:463]     Cache-Control: no-cache, private
I0311 15:32:56.562977       1 round_trippers.go:463]     Content-Type: application/json
I0311 15:32:56.562984       1 round_trippers.go:463]     X-Kubernetes-Pf-Flowschema-Uid: 1b0401e7-dc93-4d03-9c25-34d72a8ed06b
I0311 15:32:56.562991       1 round_trippers.go:463]     X-Kubernetes-Pf-Prioritylevel-Uid: 2dfb3c32-9d10-4a4d-97bc-efaae0f2bd26
I0311 15:32:56.563001       1 round_trippers.go:463]     Date: Fri, 11 Mar 2022 15:32:56 GMT
I0311 15:32:56.563008       1 round_trippers.go:463]     Audit-Id: c657bc10-37af-4ee4-a366-e3d358071cc5
I0311 15:33:20.681839       1 reflector.go:535] k8s.io/client-go/informers/factory.go:134: Watch close - *v1.ReplicaSet total 0 items received
I0311 15:33:20.682038       1 round_trippers.go:435] curl -v -XGET  -H "Accept: application/vnd.kubernetes.protobuf, */*" -H "User-Agent: kube-scheduler/v0.0.0 (linux/amd64) kubernetes/$Format/scheduler" -H "Authorization: Bearer <masked>" 'https://10.96.0.1:443/apis/apps/v1/replicasets?allowWatchBookmarks=true&resourceVersion=152277&timeout=9m44s&timeoutSeconds=584&watch=true'
I0311 15:33:20.689952       1 round_trippers.go:454] GET https://10.96.0.1:443/apis/apps/v1/replicasets?allowWatchBookmarks=true&resourceVersion=152277&timeout=9m44s&timeoutSeconds=584&watch=true 200 OK in 7 milliseconds
I0311 15:33:20.689970       1 round_trippers.go:460] Response Headers:
I0311 15:33:20.689977       1 round_trippers.go:463]     Audit-Id: d88bb3af-7534-4f4b-9ee7-87993efc5948
I0311 15:33:20.689983       1 round_trippers.go:463]     Cache-Control: no-cache, private
I0311 15:33:20.689988       1 round_trippers.go:463]     Content-Type: application/vnd.kubernetes.protobuf;stream=watch
I0311 15:33:20.689993       1 round_trippers.go:463]     X-Kubernetes-Pf-Flowschema-Uid: 1b0401e7-dc93-4d03-9c25-34d72a8ed06b
I0311 15:33:20.689998       1 round_trippers.go:463]     X-Kubernetes-Pf-Prioritylevel-Uid: 2dfb3c32-9d10-4a4d-97bc-efaae0f2bd26
I0311 15:33:20.690003       1 round_trippers.go:463]     Date: Fri, 11 Mar 2022 15:33:20 GMT
I0311 15:33:31.696350       1 reflector.go:535] k8s.io/client-go/informers/factory.go:134: Watch close - *v1.Pod total 20 items received
I0311 15:33:31.696548       1 round_trippers.go:435] curl -v -XGET  -H "Accept: application/vnd.kubernetes.protobuf, */*" -H "User-Agent: kube-scheduler/v0.0.0 (linux/amd64) kubernetes/$Format/scheduler" -H "Authorization: Bearer <masked>" 'https://10.96.0.1:443/api/v1/pods?allowWatchBookmarks=true&fieldSelector=status.phase%21%3DSucceeded%2Cstatus.phase%21%3DFailed&resourceVersion=152778&timeout=6m31s&timeoutSeconds=391&watch=true'
I0311 15:33:31.699658       1 round_trippers.go:454] GET https://10.96.0.1:443/api/v1/pods?allowWatchBookmarks=true&fieldSelector=status.phase%21%3DSucceeded%2Cstatus.phase%21%3DFailed&resourceVersion=152778&timeout=6m31s&timeoutSeconds=391&watch=true 200 OK in 3 milliseconds
I0311 15:33:31.699673       1 round_trippers.go:460] Response Headers:
I0311 15:33:31.699682       1 round_trippers.go:463]     Content-Type: application/vnd.kubernetes.protobuf;stream=watch
I0311 15:33:31.699688       1 round_trippers.go:463]     X-Kubernetes-Pf-Flowschema-Uid: 1b0401e7-dc93-4d03-9c25-34d72a8ed06b
I0311 15:33:31.699693       1 round_trippers.go:463]     X-Kubernetes-Pf-Prioritylevel-Uid: 2dfb3c32-9d10-4a4d-97bc-efaae0f2bd26
I0311 15:33:31.699698       1 round_trippers.go:463]     Date: Fri, 11 Mar 2022 15:33:31 GMT
I0311 15:33:31.699718       1 round_trippers.go:463]     Audit-Id: ad8fa8b2-a480-4dca-b519-2401f7e2c4f4
I0311 15:33:31.699723       1 round_trippers.go:463]     Cache-Control: no-cache, private
I0311 15:33:41.693672       1 reflector.go:535] k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:205: Watch close - *v1.ConfigMap total 6 items received
I0311 15:33:41.693902       1 round_trippers.go:435] curl -v -XGET  -H "Authorization: Bearer <masked>" -H "Accept: application/json, */*" -H "User-Agent: kube-scheduler/v0.0.0 (linux/amd64) kubernetes/$Format" 'https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps?allowWatchBookmarks=true&fieldSelector=metadata.name%3Dextension-apiserver-authentication&resourceVersion=152647&timeout=7m50s&timeoutSeconds=470&watch=true'
I0311 15:33:41.697303       1 round_trippers.go:454] GET https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps?allowWatchBookmarks=true&fieldSelector=metadata.name%3Dextension-apiserver-authentication&resourceVersion=152647&timeout=7m50s&timeoutSeconds=470&watch=true 200 OK in 3 milliseconds
I0311 15:33:41.697318       1 round_trippers.go:460] Response Headers:
I0311 15:33:41.697328       1 round_trippers.go:463]     Audit-Id: b24917dc-47ac-40e1-902e-4a5edcfa4d02
I0311 15:33:41.697333       1 round_trippers.go:463]     Cache-Control: no-cache, private
I0311 15:33:41.697338       1 round_trippers.go:463]     Content-Type: application/json
I0311 15:33:41.697343       1 round_trippers.go:463]     X-Kubernetes-Pf-Flowschema-Uid: 1b0401e7-dc93-4d03-9c25-34d72a8ed06b
I0311 15:33:41.697348       1 round_trippers.go:463]     X-Kubernetes-Pf-Prioritylevel-Uid: 2dfb3c32-9d10-4a4d-97bc-efaae0f2bd26
I0311 15:33:41.697353       1 round_trippers.go:463]     Date: Fri, 11 Mar 2022 15:33:41 GMT
I0311 15:34:12.292812       1 kubeflux.go:382] Delete Pod event handler
I0311 15:34:12.292843       1 kubeflux.go:385] Pod status: Pending
I0311 15:34:22.688422       1 reflector.go:535] k8s.io/client-go/informers/factory.go:134: Watch close - *v1.ReplicationController total 4 items received
I0311 15:34:22.688643       1 round_trippers.go:435] curl -v -XGET  -H "Accept: application/vnd.kubernetes.protobuf, */*" -H "User-Agent: kube-scheduler/v0.0.0 (linux/amd64) kubernetes/$Format/scheduler" -H "Authorization: Bearer <masked>" 'https://10.96.0.1:443/api/v1/replicationcontrollers?allowWatchBookmarks=true&resourceVersion=152687&timeout=6m21s&timeoutSeconds=381&watch=true'
I0311 15:34:22.694944       1 round_trippers.go:454] GET https://10.96.0.1:443/api/v1/replicationcontrollers?allowWatchBookmarks=true&resourceVersion=152687&timeout=6m21s&timeoutSeconds=381&watch=true 200 OK in 6 milliseconds
I0311 15:34:22.694952       1 round_trippers.go:460] Response Headers:
I0311 15:34:22.694956       1 round_trippers.go:463]     Audit-Id: a3f73a51-9753-46cd-94b3-51be7a0da0c7
I0311 15:34:22.694958       1 round_trippers.go:463]     Cache-Control: no-cache, private
I0311 15:34:22.694961       1 round_trippers.go:463]     Content-Type: application/vnd.kubernetes.protobuf;stream=watch
I0311 15:34:22.694963       1 round_trippers.go:463]     X-Kubernetes-Pf-Flowschema-Uid: 1b0401e7-dc93-4d03-9c25-34d72a8ed06b
I0311 15:34:22.694966       1 round_trippers.go:463]     X-Kubernetes-Pf-Prioritylevel-Uid: 2dfb3c32-9d10-4a4d-97bc-efaae0f2bd26
I0311 15:34:22.694969       1 round_trippers.go:463]     Date: Fri, 11 Mar 2022 15:34:22 GMT

Re-org doesn't break functionality

Ready for review @cmisale @milroy

Let's not merge until both of you drop an "OK TO ME" message in this PR.

cmisale commented 2 years ago

I don't see evidence of a pod actually being scheduled. It's good that no segfault is happening, which is a great start. Can you provide evidence that a pod is being scheduled, along with the logs from the sidecar?
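
One way to produce that evidence, sketched here under assumptions: a pod that has been bound by a scheduler carries a non-empty `spec.nodeName` and a `PodScheduled` condition with status `True`. The small Go helper below checks both on a pod's JSON. The scheduler name `kubeflux` is an assumption (the job manifest is not shown in this thread), and `scheduledBy` is a hypothetical helper, not part of this repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pod mirrors only the Pod fields needed to judge whether scheduling happened.
type pod struct {
	Spec struct {
		NodeName      string `json:"nodeName"`
		SchedulerName string `json:"schedulerName"`
	} `json:"spec"`
	Status struct {
		Conditions []struct {
			Type   string `json:"type"`
			Status string `json:"status"`
		} `json:"conditions"`
	} `json:"status"`
}

// scheduledBy (hypothetical helper) reports whether the pod JSON names the
// given scheduler, has a node assigned, and has PodScheduled=True.
// It returns the node name when all three hold.
func scheduledBy(podJSON []byte, scheduler string) (bool, string, error) {
	var p pod
	if err := json.Unmarshal(podJSON, &p); err != nil {
		return false, "", err
	}
	if p.Spec.SchedulerName != scheduler || p.Spec.NodeName == "" {
		return false, "", nil
	}
	for _, c := range p.Status.Conditions {
		if c.Type == "PodScheduled" && c.Status == "True" {
			return true, p.Spec.NodeName, nil
		}
	}
	return false, "", nil
}

func main() {
	// Illustrative sample only; "kubeflux" as schedulerName is assumed.
	sample := []byte(`{
	  "spec": {"nodeName": "minikube-m02", "schedulerName": "kubeflux"},
	  "status": {"conditions": [{"type": "PodScheduled", "status": "True"}]}
	}`)
	ok, node, err := scheduledBy(sample, "kubeflux")
	if err != nil {
		fmt.Println("invalid pod JSON:", err)
		return
	}
	fmt.Printf("scheduled=%v node=%q\n", ok, node)
}
```

In practice the input would come from `kubectl get pod <pod-name> -o json` rather than an inline sample. The pod's events should also show a `Scheduled` event attributed to the custom scheduler if the binding went through it.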

ArangoGutierrez commented 2 years ago

Sidecar logs

[eduardo@fedora-workstation scheduler-plugins]$ kubectl logs pod/kubeflux-7974d54554-n5xcq kubeflux-sidecar -n kube-system 
This is the fluxion grpc server
Created cli context  &{}
&{}
Number nodes  3
Node  minikube-m02  flux cpu  2
Node  minikube-m03  flux cpu  2
[GRPCServer] gRPC Listening on [::]:4242

ArangoGutierrez commented 2 years ago

Job logs

[eduardo@fedora-workstation flux-k8s]$ kubectl apply  -f  job.yaml
job.batch/pi-job-kubeflux-sched created
[eduardo@fedora-workstation flux-k8s]$ kubectl get job
NAME                    COMPLETIONS   DURATION   AGE
pi-job-kubeflux-sched   0/4           4s         4s
[eduardo@fedora-workstation flux-k8s]$ kubectl describe job/pi-job-kubeflux-sched
Name:             pi-job-kubeflux-sched
Namespace:        default
Selector:         controller-uid=18d0c9d7-c6e8-436b-9206-76791105ee7d
Labels:           app=pi-test-kubeflux
                  controller-uid=18d0c9d7-c6e8-436b-9206-76791105ee7d
                  job-name=pi-job-kubeflux-sched
Annotations:      batch.kubernetes.io/job-tracking: 
Parallelism:      1
Completions:      4
Completion Mode:  NonIndexed
Start Time:       Fri, 11 Mar 2022 10:44:56 -0500
Pods Statuses:    1 Active / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  app=pi-test-kubeflux
           controller-uid=18d0c9d7-c6e8-436b-9206-76791105ee7d
           job-name=pi-job-kubeflux-sched
  Containers:
   pi-test:
    Image:      quay.io/eduardoarango/pi:ubi8
    Port:       <none>
    Host Port:  <none>
    Limits:
      cpu:  2
    Requests:
      cpu:        2
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  20s   job-controller  Created pod: pi-job-kubeflux-sched-2c86x