Closed sjawhar closed 2 years ago
Can you provide a main.tf
you tested with/can be tested with/example?
Also thanks for the contribution!
Can you provide a main.tf you tested with/can be tested with/example?
First, create a persistent volume with ReadWriteMany. For example:
kind: PersistentVolume
apiVersion: v1
metadata:
name: pv-storage
labels:
type: local
spec:
capacity:
storage: 4Gi
accessModes:
- ReadWriteMany
hostPath:
path: /tpi-data
Then use this main.tf
terraform {
required_providers {
iterative = {
source = "github.com/iterative/iterative"
}
}
}
provider "iterative" {}
resource "iterative_task" "example" {
cloud = "k8s"
machine = "s"
disk_size = 1
parallelism = 3
indexed = false
storage {
workdir = "."
output = "results"
}
script = <<-END
#!/bin/bash
mkdir -p results
printenv > results/out-$JOB_COMPLETION_INDEX
END
}
related #585 /CC @0x2b3bfa0
Awesome feature, @sjawhar! heart_eyes It would be nice to enable
IndexedCompletion
automatically whenparallelism
is greater than1
instead of exposing a separate option that has no equivalent on other providers.
Thanks for the quick review! I also didn't like having an option that only applies to k8s, but I don't quite agree with your suggestion. Let me try to explain, though bear with me because I'm coming from a very cloud-heavy background and am relatively new to k8s.
K8s has two ways of running parallel jobs: indexed and nonindexed. Nonindexed means the jobs are identical, start up N, they do their thing. Indexed means that k8s expects a success for at least one job of each index 1-N. In other words, if job 3 fails, it'll keep retrying 3 specifically. They're useful for different things: you might use nonindexed to spawn N workers to accomplish M jobs by reading from a task queue, whereas you might use indexed to accomplish the same thing without the queue (e.g. by giving the full task list to each job, and then it assigns itself work based on its index).
tl;dr: both indexed and nonindexed are useful ways of running parallel jobs in k8s, and I don't think the user should be forced to use one or the other.
Now here's an extra twist: to accomplish the indexed example I described above (N workers for M jobs, where M > N), we need to be able to set .spec.completions
and .spec.parallelism
separately. So we actually need to add another k8s-specific option :sweat_smile:.
Thoughts on all this?
UPDATE: What if we removed indexed
and added completions
, and set CompletionMode to indexed if completions is set. That way there's only one k8s-specific option :smiley_cat:
if we could make index less k8s specific like somehow populate an env like TASK_COMPLETION_INDEX = [1-3]
basically reproducing the k8s JOB_COMPLETION_INDEX
behavior for the cloud instances ๐ค
I added the completions
functionality as described in my last comment. This just reflects what works for my use case, I'm happy to tear apart and refactor to make more generic, as you have both described. Also, please forgive my Go, this might be precisely the second time I've ever used it :sweat_smile:
For the use cases we[^1] have in mind, the following should be enough:
CompletionMode
equals to NonIndexed
if parallelism
is 1
else Indexed
Completions
equals to parallelism
Parallelism
equals to parallelism
The following scenarios are still not supported:
Out of curiosity, can you share a bit more of information on your use case? I guess that you may want to restrict the parallelism so running a massively parallel task doesn't exhaust your cluster's resources. ๐ค
[^1]: ๐ @iterative/cml for sanity
CompletionMode equals to NonIndexed if parallelism is 1 else Indexed
I think you still might not be grokking the purpose of non/indexed. They're both valid modes of parallel operation, it's not about parallelism = 1 or not.
Out of curiosity, can you share a bit more of information on your use case? I guess that you may want to restrict the parallelism so running a massively parallel task doesn't exhaust your cluster's resources. ๐ค
You got it. If I need to run 100 things, I don't want to spawn 100 pods. This seems to me like a necessary consideration for supporting on-prem/k8s.
As you can see from the diff, this is a very minor change that opens up a lot of functionality in k8s. I'd be grateful if you'd reconsider the pros/cons here. ๐
If I need to run 100 things, I don't want to spawn 100 pods
What you propose is similar to jobs.<job_id>.strategy.max-parallel
on GitHub Actions and could be useful, although supporting it outside k8s
is significantly more complex.
As per your suggestion, the current parallelism
argument should be replaced by two different arguments:
completions
in the Kubernetes parlance)parallelism
in the Kubernetes parlance)By default, 2 should be equal to 1 if not specified by users.
I think you still might not be grokking the purpose of non/indexed. They're both valid modes of parallel operation, it's not about parallelism = 1 or not.
Sorry, I explained myself badly. ๐
Traditionally, Kubernetes Jobs didn't have an Indexed
mode, and were meant to run exactly the same code one or more times. Using an external task queue was the officially recommended way of running different tasks.
This project tries to simplify the user's workflow as much as possible: the Indexed
mode eliminates the need for an external task queue in those cases when the total number of tasks is known in advance, but still allows users to differentiate every index.
My insistence of using NonIndexed
mode when parallelism
is 1
is just to support Kubernetes versions prior 1.24 if users don't need this feature. Requiring a single completion renders this feature useless, so we can safely disable it; when requiring more completions, it will be always enabled.
@iterative/cml
So, assuming:
CompletionsMode
can be inferred implicitly, and going with the latest suggestion.
How do we want to handle something like (user provides k8s specific field when not using k8s):
resource "iterative_task" "example" {
cloud = "aws"
...
parallelism = 3
completions = 5
}
Error: "completions not supported for non-k8s run-time"? (I am allergic to referring to k8s as a cloud provider ๐ ) We definitely don't have to support a task-queue like logic at this point IMO. We're stepping dangerously into "implementing scheduler" territory.
Considering the above, I have to point out the very likely fact that more and more k8s-specific switches will probably be requested and added in the future, and maybe some cloud-provider specific ones as well - some that it won't be natural to abstract away or support across providers neatly. My personal opinion is that we need to keep those as true as possible to the original behavior (=same name, some provider/run-time specific block? maybe nested under "advanced"?) so users can easily infer their behavior from the original docs without inventing ad-hoc abstractions for them that aren't natural. Thoughts on this?
introducing a k8s
-specific config block makes sense, esp. since it's not-a-cloud:
resource "iterative_task" "example" {
cloud = "k8s"
parallelism = ... # warn iff defined on `cloud = "k8s"`?
# k8s-specific block
k8s {
parallelism = 3 # in the k8s sense, not in the TPI sense
completions = 5
}
}
I'd avoid introducing backend-specific details, especially when the outer parallelism
is functionally equivalent to the inner completions
and the inner parallelism has no equivalent in other backends.
@sjawhar, would it be possible to keep the original scope of this pull request (enable indexed
mode when Kubernetes completions
> 1) and open a separate pull request for the addition of a separate parallelism limit?
@sjawhar, would it be possible to keep the original scope of this pull request (enable
indexed
mode when Kubernetescompletions
> 1) and open a separate pull request for the addition of a separate parallelism limit?
I really don't think that's a good idea. Users should not be forced to use indexed mode. I went that route initially just to make it work for my use case as quickly as possible, but this would be both a change from the current behavior and unnecessarily limiting. Are you sure you want to do that?
this would be both a change from the current behavior and unnecessarily limiting
I thought that enabling the indexed mode doesn't have any significant effect beyond exposing a couple of additional environment variables. Which kind of unnecessary limitations do you foresee?
I thought that enabling the indexed mode doesn't have any significant effect beyond exposing a couple of additional environment variables. Which kind of unnecessary limitations do you foresee?
It's not just about environment variables. From the docs:
- NonIndexed (default): the Job is considered complete when there have been
.spec.completions
successfully completed Pods. In other words, each Pod completion is homologous to each other.- Indexed: the Pods of a Job get an associated completion index from 0 to
.spec.completions-1
. The Job is considered complete when there is one successfully completed Pod for each index.
These are two very different behaviors. Your suggestion work perfectly well for my use case (Indexed is what I need), but it seems unnecessarily limiting. The spec already has provider-specific flags (e.g. region does not apply to k8s), so we're not breaking new ground by adding another one.
Nonetheless, if you're sure then I'll make the change.
Your suggestion work perfectly well for my use case (Indexed is what I need), but it seems unnecessarily limiting.
Indexed mode is precisely the kind of behavior this tool is meant to have; see #585. Unless there is any practical use case that requires the NonIndexed
mode, it looks like there is no pressing need to support both.
The spec already has provider-specific flags (e.g. region does not apply to k8s), so we're not breaking new ground by adding another one.
Ouch! ๐
Indeed, although there were plans to โrename region
to location
so it's valid for zones and node selectorsโ as per https://github.com/iterative/terraform-provider-iterative/issues/412#issue-1152159459
@0x2b3bfa0
Indexed mode is precisely the kind of behavior this tool is meant to have; see https://github.com/iterative/terraform-provider-iterative/issues/585. Unless there is any practical use case that requires the NonIndexed mode, it looks like there is no pressing need to support both.
I'm with @sjawhar on this one - it makes no sense taking away this choice for the perceived "simplicity" of merely saving a field here (1:1 with k8s field, explainability & docs built in). If we will get here the power of NonIndexed
for the case of k8s only - that's still valuable IMO. No need to abuse terminology or over generalize - this should be in a k8s-specific block imo (yes, exposing backend specific details), no need to try and provide same functionality/choice across all other clouds/engines - that would be a broken abstraction.
The key to keep our code powerful and still simple, meaningful and maintainable, IMO, is this - once something is not easily/directly generalizable - keep it provider specific, help the user transition to k8s terminology
"rename region to location so it's valid for zones and node selectors"
I'd advise against ๐จ (commented on #412 )
the power of
NonIndexed
Citation needed. ๐ If we provide โpowerfulโ options it should be for a reason, methinks.
this should be in a k8s-specific block imo (yes, exposing backend specific details), no need to try and provide same functionality/choice across all other clouds/engines - that would be a broken abstraction.
We may end up exposing backend-specific details when shoehorning them under a common specification is proven to be difficult or impractical. ๐๐ผ
Although I wonder if this is the case.
The key to keep our code powerful and still simple, meaningful and maintainable, IMO, is this - once something is not easily/directly generalizable - keep it provider specific, help the user transition to k8s terminology
When terminology is [susceptible of being] generalizable and different backends overload the same names with different meanings, I believe that it makes sense to run the extra mile and โgeneralizeโ on our side a bit more.
I'd argue that this particular functionality should be generalized, although in some other cases (e.g. placement) we end up resorting to specific options.
taking away this choice for the perceived "simplicity" of merely saving a field here
Taking away some choices isn't implicitly negative; quite the opposite. Unless we find any supported use case that benefits particularly from NonIndexed
mode, I'd consider not exposing that choice to users.
We discussed this offline a bit, and specifically - I find it hard to defend the choice for NoneIndexed
, except for the religious reason of "give me ma k8s options!"
So I would say we can live without it for now
Ok, I'll make the changes sometime in the next few days
I came across this PR while looking for this feature with AWS EC2. I think the ability to operate parallel instances with regular cloud providers and have some sort of indexing, or any mechanism, to dispatch work to the different instances can greatly help small teams and individual developers who don't have resources to manage k8s.
@redabuspatrol, can you please comment on #585? In the meanwhile, you may also want to try having a separate task
for every chunk of work.
Sorry for the long delay. I've reverted most of the changes, and implemented @0x2b3bfa0 's preferred solution.
Thank you very much, @sjawhar! ๐๐ผ
Thanks so much @sjawhar! Sorry for the delay and (perhaps excessive) hesitancy from our end too.
The reverted changes still look worth pursuing to us, just in separate PRs. We'd probably also be much quicker at understanding & merging said PRs if you could share an example of their use too... I assume related to things like https://github.com/iterative/terraform-provider-iterative/compare/master...sjawhar:terraform-provider-iterative:feature/nfs-volume and https://github.com/iterative/dvc/compare/main...sjawhar:dvc:feature/parallel-repro?
I can open separate issues for each of the following features
completions
separately from parallelism
for k8s jobs (the feature that was rolled back in this PR)region
field to set node selectors for k8s jobscontroller-uid
is queried in k8s jobs. The current method breaks when you add custom tags/labels to the jobs.From the links above, you'll see I've already implemented all of these in the feature/nfs-volume
branch, which I'm using for a project at work. Would be great to get all of them upstream, and I think the only sticking point is how to represent cloud-specific options in Terraform file.
Purpose
Parallel K8s jobs have two modes: Indexed and NonIndexed (default). Indexed jobs have some nice abilities, for example each job gets a
JOB_COMPLETION_INDEX
environment variable so it could assign itself work without needing a separate task queue. There's currently no way to specify the k8s completion mode.Approach
Added an optional boolean
indexed
attributeExample when set to
false
or ommitted:Example when set to
true
TODO