Closed YevheniiSemendiak closed 2 years ago
Hello, this is not a neuro-flow error, but a basic neuro API. Let me explain what is happening by example:
neuro run --name somename ...
neuro config switch-cluster ...
neuro run --name somename ...
So you are trying to create multiple jobs with the same name. Because a job name is interchangeable with the job ID, and can therefore be used in non-cluster-specific commands such as neuro status, it is forbidden to have multiple running jobs with the same name, even if they are in different clusters.
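To illustrate the constraint, here is a minimal toy model in Python. It is not the platform's implementation; the class and method names (`GlobalNameRegistry`, `run`, `status`) are hypothetical. It only assumes what is described above: a running job's name must resolve to exactly one job without knowing the cluster.

```python
class JobNameTaken(Exception):
    """Raised when a running job already uses the requested name."""


class GlobalNameRegistry:
    """Toy model (hypothetical): running job names are unique across all clusters."""

    def __init__(self):
        # Maps job name -> cluster of the running job that owns the name.
        self._running = {}

    def run(self, name, cluster):
        # The name alone must identify a job, so a second job with the
        # same name is rejected even when it targets another cluster.
        if name in self._running:
            raise JobNameTaken(
                f"job {name!r} is already running on cluster {self._running[name]!r}"
            )
        self._running[name] = cluster

    def status(self, name):
        # A non-cluster-specific lookup: no cluster argument is needed.
        return self._running[name]


registry = GlobalNameRegistry()
registry.run("somename", "cluster-a")
try:
    registry.run("somename", "cluster-b")  # same name after switching cluster
except JobNameTaken as exc:
    print(exc)
```

The second `run` call fails for the same reason as the second `neuro run --name somename` in the sequence above: the name is already bound to a running job elsewhere.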
I think this error happened because we recently added auto-generated names (--name) for live jobs. To resolve your issue we can:
name attribute in such cases. Not the best UX, in my opinion.
So I do not see any 100% good solution here. I would probably prefer (2), as this is how it works for disks. Any thoughts?
I agree with your thoughts: a named job is uniquely identified by the set of name, owner (tenant in future), cluster, and control plane (the control plane might just be skipped and become implicit).
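A minimal sketch of what such a composite identity could look like, written in Python as an illustration only. `JobKey` and `ScopedJobRegistry` are hypothetical names, the tenant and control plane are left out (treated as implicit), and nothing here reflects the actual platform code.

```python
from typing import NamedTuple


class JobKey(NamedTuple):
    """Hypothetical composite identity of a named job."""
    name: str
    owner: str
    cluster: str


class ScopedJobRegistry:
    """Toy model (hypothetical): uniqueness is enforced per (name, owner, cluster)."""

    def __init__(self):
        self._running = set()

    def run(self, key: JobKey) -> None:
        # Only an identical (name, owner, cluster) combination conflicts.
        if key in self._running:
            raise ValueError(f"{key} is already running")
        self._running.add(key)

    def resolve(self, name: str, owner: str, current_cluster: str) -> JobKey:
        # A bare name is no longer enough: the lookup needs the current cluster.
        key = JobKey(name, owner, current_cluster)
        if key not in self._running:
            raise KeyError(f"no running job {name!r} on cluster {current_cluster!r}")
        return key


registry = ScopedJobRegistry()
registry.run(JobKey("somename", "alice", "cluster-a"))
registry.run(JobKey("somename", "alice", "cluster-b"))  # allowed: different cluster
```

The trade-off is visible in `resolve`: commands that accept a job name would have to become cluster-aware, which is the compatibility question raised in this thread.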
Variant 2 sounds reasonable. Regarding compatibility, we might discuss it within the team and decide whether it is worth doing at the moment.
As for me, it is not a blocker, or whatever, just some sort of "weird thing".
Another thought: if we are trying to enrich neuro-flow's collaboration capabilities, my suggestion might be a step in the opposite direction: toward the global scope instead of the project scope. For NF this logic should probably be different. Summoning the MLOps team, @anayden and @mariyadavydova: WDYT in the context of this issue?
First off, I imagine this issue will never be seen by 99% of our users, as they stick to one cluster.
Option (1) seems bad because it will create job hostnames with a duplicated cluster string, jobname--username--clustername.jobs.clustername.org.neu.ro, and make the useful part of the job name even shorter than it is now.
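A rough sketch of the hostname concern, in Python. Two assumptions are made here that the thread does not state: that the current hostname is jobname--username.jobs.clustername.org.neu.ro (the format above minus the extra cluster suffix), and that the limiting factor is the 63-character maximum for a single DNS label. Both helper functions are hypothetical.

```python
MAX_DNS_LABEL = 63  # maximum length of one DNS label (RFC 1035)


def job_hostname(job_name, owner, cluster, name_includes_cluster=False):
    """Build a job hostname; the format is an assumption based on this thread."""
    label = f"{job_name}--{owner}"
    if name_includes_cluster:
        # Option (1): the cluster becomes part of the name, so it appears twice.
        label += f"--{cluster}"
    return f"{label}.jobs.{cluster}.org.neu.ro"


def max_job_name_length(owner, cluster, name_includes_cluster=False):
    """Characters left for the job name inside the first DNS label."""
    reserved = len("--") + len(owner)
    if name_includes_cluster:
        reserved += len("--") + len(cluster)
    return MAX_DNS_LABEL - reserved


print(job_hostname("train", "alice", "default", name_includes_cluster=True))
print(max_job_name_length("alice", "default"))
print(max_job_name_length("alice", "default", name_includes_cluster=True))
```

Under these assumptions, every character of the cluster name plus the two-character separator is taken directly out of the budget available for the job name.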
As far as "moving to global scope" idea is concerned, I don't fully understand what it means in this context.
Exactly: require the uniqueness of the triple name, owner (tenant in future), cluster.
@romasku I have another observation of improper behavior in this context. STR:
examples/demo-jobs folder in this repo.
live.yml, develop job definition: remove port forwarding config.
Job develop is running, connecting...
Expected: New job instance is running
Duplicate of https://github.com/neuro-inc/neuro-cli/issues/2442
STR:
Expected: you can launch the same tasks on different clusters