jina-ai / jina

☁️ Build multimodal AI applications with cloud-native stack
https://docs.jina.ai
Apache License 2.0
21.05k stars 2.22k forks source link

Dynamic k8s namespace for generating kubernetes yaml #5994

Closed sansmoraxz closed 4 months ago

sansmoraxz commented 1 year ago

Describe the feature The current implementation for to_kubernetes_yaml flow takes k8s_namespace to be explicitly specified somewhere otherwise it outputs as default namespace. The namespace need not be explicitly set and can be dynamically injected through metadata, hence improving the reusability of the generated templates.

For example:

kubectl apply -f "file.yaml" --namespace=prod
kubectl apply -f "file.yaml" --namespace=dev

Your proposal

We may accept that the Optional field can be None and propagate that when generating the yaml, in which case the generated yaml will include this for env injection:

        - name: K8S_NAMESPACE_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace

And of course we may remove the injection of namespace for all the generated objects in this case.

More info from official docs

JoanFM commented 1 year ago

Hey @sansmoraxz,

This looks like a great feature. Since I see thwat you have digged in the code a bit, would you be up to implement it?

sansmoraxz commented 1 year ago

Sure

JoanFM commented 1 year ago

Please get in touch if you need any help, we will be welcoming your PR.

sansmoraxz commented 1 year ago

For deployment address (viz. injected to gateway) I see that services are being referred to by their full DNS name. Is it ok to open a sh shell, to invoke the jina cli? That way we can refer to the injected env K8S_NAMESPACE_NAME to refer to the executors.

Alternatively we can use with just the service name rather than the full DNS name of the service.

JoanFM commented 1 year ago

The namespace argument is propagated in the call to the to_kubernetes_yaml to every Deplpyment in the Flow, including the gateway one.

I am not sure I fully understand the question though

sansmoraxz commented 1 year ago

I meant assume gateway has the following generated command invoked:

containers:
      - args:
        - gateway
        - --k8s-namespace
        - default
        - --expose-endpoints
        - '{}'
        - --host
        - '1'
        - --uses
        - GRPCGateway
        - --graph-description
        - '{"executor0": ["executor1"], "start-gateway": ["executor0"], "executor1":
          ["end-gateway"]}'
        - --deployments-addresses
        - '{"executor0": ["grpc://executor0.default.svc:8080"], "executor1": ["grpc://executor1.default.svc:8080"]}'
        - --port
        - '8080'
        - --port-monitoring
        - '9090'
        command:
        - jina

The deployment addresses are not really dynamic

sansmoraxz commented 1 year ago

We can probably try to use something like this:

command:
        - /bin/bash
        - -c
        - jina
        - gateway
        - --k8s-namespace
        - $K8S_NAMESPACE_NAME

We invoke the bash first because directly passing it to the args doesn't seem to expand to the env value.

Same with the deployments-addresses. Although passing it the full service name is not really required.

This works just as well

'{"executor0": ["grpc://executor0:8080"], "executor1": ["grpc://executor1:8080"]}'
deepankarm commented 1 year ago

Although passing it the full service name is not really required. This works just as well

@sansmoraxz you're right. The complete FQDN is not required in deployment addresses, as both gateway & the executors are in the same namespace. We rely on service meshes (like Linkerd) that recommend using FQDNs rather than shorter names to avoid ambiguity during routing, hence we follow the best practices.

JoanFM commented 1 year ago

I see the problem of passing the namespace by env var to the entrypoint. Perhaps passing with ENV var syntax may work but I am not sure.

Otherwise I will take a look after holiday, thanks for your deep investigation.

sansmoraxz commented 1 year ago

Alternatively the jina cli can be told to look for specific environment variable names (and files) and we don't need to override the command explicitly.

sansmoraxz commented 1 year ago

For instance, look for the presence of /var/run/secrets/kubernetes.io or .dockerenv, to detect appropriate environments at startup. Since those are fixed name environment variable they can be directly passed to be loaded. This will also make the generated manifests a lot cleaner.

JoanFM commented 1 year ago

I will try to look at it in more detail this incoming week and discuss some proposals about it

JoanFM commented 1 year ago

Hey @sansmoraxz .

I will take a look at this soon.

Thanks for the patience

JoanFM commented 1 year ago

Hey @sansmoraxz,

What do u think about this solution:

I believe this should enable this feature, what do you say?

Actually, I think this is exactly what you have been meaning all this time right? Here I just summarized all in the same message?

sansmoraxz commented 1 year ago

My concern was that args didn't resolve for env.

Testing this small code on kubernetes 1.27:

apiVersion: v1
kind: Pod
metadata:
  name: command-demo
  labels:
    purpose: demonstrate-command
spec:
  containers:
  - name: command-demo-container
    image: alpine
    command: ["echo"]
    args: ["$ENV"]
    env:
    - name: ENV
      value: "Hello World!"
  restartPolicy: OnFailure

The output is $ENV

JoanFM commented 1 year ago

My concern was that args didn't resolve for env.

Testing this small code on kubernetes 1.27:

apiVersion: v1
kind: Pod
metadata:
  name: command-demo
  labels:
    purpose: demonstrate-command
spec:
  containers:
  - name: command-demo-container
    image: alpine
    command: ["echo"]
    args: ["$ENV"]
    env:
    - name: ENV
      value: "Hello World!"
  restartPolicy: OnFailure

The output is $ENV

Looking at this issue:

https://github.com/kubernetes/kubernetes/issues/57726

It seems that using downards API as u suggest should work however

sansmoraxz commented 1 year ago

OK so wrapping in $() does the trick. Interesting! this does tend to generally imply nested shell execution.

JoanFM commented 1 year ago

OK so wrapping in $() does the trick. Interesting! this does tend to generally imply nested shell execution.

I am not sure honestly

JoanFM commented 1 year ago

So then, we are good with this proposal?

sansmoraxz commented 1 year ago

Yes. One thing I was considering was using jinja to dynamically make the resources. We can offload the code templating to it, which in turn would make the code a lot cleaner. This may go under a future proposal. Too many changes to be made.

JoanFM commented 1 year ago

this can be another improvement, I agree, but I would not mix the actual feature with the nice refactoring

jina-bot commented 10 months ago

@jina-ai/product This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days

jina-bot commented 7 months ago

@jina-ai/product This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days

jina-bot commented 4 months ago

@jina-ai/product This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days

sansmoraxz commented 4 months ago

Forgot that I had this in my backlog.

There was some blocker on my side (which I don't remember properly). Give me a week to find what it was and see if I can resolve it. Otherwise you may assign this to someone else.