Open googs1025 opened 3 weeks ago
@danielvegamyhre @kannon92 @ahg-g Is this a required feature?
/assign
This is really a K8s request and not for Jobset.
IndexedJob should expose an environment variable for you to use. JOB_COMPLETION_INDEX
@kannon92 thanks for the review! Yes, I know k8s already has the JOB_COMPLETION_INDEX environment variable. But JOB_COMPLETION_INDEX and JOBSET_INDEX seem to mean different things.
root@VM-0-9-ubuntu:~/jobset/examples/simple# kubectl describe pods paralleljobs-workers-1-2-fd7d6
Name:             paralleljobs-workers-1-2-fd7d6
Namespace:        default
Priority:         0
Service Account:  default
Node:             cluster1-worker/172.18.0.2
Start Time:       Tue, 04 Jun 2024 20:51:28 +0800
Labels:           batch.kubernetes.io/controller-uid=62411fea-0401-42c1-b0d7-2b15d1abdc8f
                  batch.kubernetes.io/job-completion-index=2
                  batch.kubernetes.io/job-name=paralleljobs-workers-1
                  controller-uid=62411fea-0401-42c1-b0d7-2b15d1abdc8f
                  job-name=paralleljobs-workers-1
                  jobset.sigs.k8s.io/job-index=1
                  jobset.sigs.k8s.io/job-key=4e1c31554543f8219df068ce823cff3c77b9ec8c
                  jobset.sigs.k8s.io/jobset-name=paralleljobs
                  jobset.sigs.k8s.io/replicatedjob-name=workers
                  jobset.sigs.k8s.io/replicatedjob-replicas=3
                  jobset.sigs.k8s.io/restart-attempt=0
Annotations:      batch.kubernetes.io/job-completion-index: 2
                  jobset.sigs.k8s.io/job-index: 1
                  jobset.sigs.k8s.io/job-key: 4e1c31554543f8219df068ce823cff3c77b9ec8c
                  jobset.sigs.k8s.io/jobset-name: paralleljobs
                  jobset.sigs.k8s.io/replicatedjob-name: workers
                  jobset.sigs.k8s.io/replicatedjob-replicas: 3
                  jobset.sigs.k8s.io/restart-attempt: 0
I'm not sure if I understand it correctly, please forgive me if I am wrong.
JOB_COMPLETION_INDEX: each Pod of a Job gets an associated completion index from 0 to (.spec.completions - 1). This is the index in the Job dimension.
JOBSET_INDEX: the index of each Job within a ReplicatedJob. This is the index in the ReplicatedJob dimension.
@googs1025 you can use the downward API to set an environment variable to the value of a label or annotation: https://kubernetes.io/docs/concepts/workloads/pods/downward-api/
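As a rough sketch of that suggestion, the fragment below maps two of the labels from the `kubectl describe` output above onto environment variables through a `fieldRef`. The variable names `JOBSET_INDEX` and `REPLICATED_JOB_NAME` are only illustrative (JobSet does not define them), and the fragment is assumed to sit under a container in the pod template of a replicatedJob. `JOB_COMPLETION_INDEX` is not listed because Indexed Jobs already expose it.

```yaml
# Container-level env fragment; goes under
# spec.replicatedJobs[].template.spec.template.spec.containers[].
env:
# Hypothetical variable name: index of this Job within its ReplicatedJob.
- name: JOBSET_INDEX
  valueFrom:
    fieldRef:
      fieldPath: metadata.labels['jobset.sigs.k8s.io/job-index']
# Hypothetical variable name: which ReplicatedJob this Pod belongs to.
- name: REPLICATED_JOB_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.labels['jobset.sigs.k8s.io/replicatedjob-name']
```

With the Pod shown earlier, a container using this fragment would presumably see `JOBSET_INDEX=1` alongside the built-in `JOB_COMPLETION_INDEX=2`.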
@danielvegamyhre Yes, I know I can use the downward API. I am wondering whether JobSet itself should provide a way to inject this kind of information into containers.
In other words, since a jobset contains many jobs and pods, should we provide a global configuration capability?
apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: jobset-example
spec:
  # can configure global parameters,
  # which will take effect in every job and pod after setting.
  globalParams:
    # can add the parameters required by the container
    env:
    - name: "FOO"
      value: "bar"
    - name: "QUE"
      value: "pasa"
    # job pod annotations
    annotations:
      key1: value1
      key2: value2
    # job pod labels
    labels:
      key1: value1
      key2: value2
If you add labels/annotations to the job template I think they are sent to all downstream objects (job, pods). I don’t know if we do that for the service we create.
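For comparison with the proposed globalParams field, here is a minimal sketch of how the same values can be set with the existing API, per replicatedJob. It puts the labels and annotations directly on the pod template, so it does not depend on whether JobSet propagates job-template metadata downstream. The image, command, and counts are placeholders, not taken from this thread.

```yaml
apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: jobset-example
spec:
  replicatedJobs:
  - name: workers
    replicas: 3
    template:              # Job template
      spec:
        completions: 2     # placeholder counts
        parallelism: 2
        completionMode: Indexed
        template:          # Pod template
          metadata:
            labels:
              key1: value1
            annotations:
              key1: value1
          spec:
            restartPolicy: Never
            containers:
            - name: worker
              image: busybox   # placeholder image
              command: ["sh", "-c", "echo FOO=$FOO"]
              env:
              - name: FOO
                value: "bar"
```

The cost of this approach is repetition: every replicatedJob that needs the values has to declare them again, which is the gap the global configuration idea is aimed at.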
Yes, if we only look at the API field I have given, it is not a very good design, because many labels will be passed to each downstream workload. But I think it is necessary to have global configuration capabilities in jobset.
Can you describe a specific use case for this?
Currently, I haven't encountered a specific use case. I just think that in the collaboration between multiple jobs, there might be a need to share some information (using environment variables for transmission) or receive some information from higher-level components. That's why I raised the question of whether this feature is needed to support such scenarios.
It is an interesting idea, but making an API change and maintaining it indefinitely is a big commitment, and I only want to do that if there are specific use cases that require this.
/kind feature
What would you like to be added:
Currently, there are multiple labels in the jobset, and I think that some of these labels should be injected into the environment variables of the containers. For example, in a TensorFlow job, there is one parameter server (PS) and two workers. Each worker is responsible for processing a specific slice of the input data. To ensure that the workers are aware of the slices they are processing, they retrieve their respective indices through environment variables. I'm just providing an example or idea, and I'm not sure if this feature is needed.
like this:
Why is this needed:
Enable the containers managed by the jobset to be aware of essential information.