Open gl-001 opened 7 months ago
"the object has been modified; please apply your changes to the latest version and try again"
This is a well-known client-side apply issue. However, this error does not cause any bugs.
/close
@tenzen-y: Closing this issue.
This error makes the job status unreliable: the job stays in Running for a long time even though the pods have already finished. If another job follows the MPIJob in a pipeline, that job is not processed even after a long wait. Is there any way to solve this problem? @tenzen-y
Completion Time: 2023-11-27T03:34:09Z
Conditions:
  Last Transition Time: 2023-11-27T03:31:
  Last Update Time: 2023-11-27T03:31:20Z
  Message: MPIJob a5qvbedvqod1-mpijob is created.
  Reason: MPIJobCreated
  Status: True
  Type: Created
  Last Transition Time: 2023-11-27T03:34:09Z
  Last Update Time: 2023-11-27T03:34:09Z
  Message: Job has reached the specified backoff limit
  Reason: BackoffLimitExceeded
  Status: True
  Type: Failed
  Last Transition Time: 2023-11-27T03:34:09Z
  Last Update Time: 2023-11-27T03:34:09Z
  Message: MPIJob a5qvbedvqod1-mpijob is running.
  Reason: MPIJobRunning
  Status: True
  Type: Running [this condition stays True for a long time]
Replica Statuses:
  Launcher:
    Failed: 1
  Worker:
Start Time: 2023-11-27T03:31:20Z
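
Until the stale Running condition is resolved, a downstream pipeline step could treat a True Failed or Succeeded condition as terminal regardless of Running. The following is only a minimal sketch of that check, assuming the conditions have already been read into a list of dictionaries. The helper name `terminal_state` and the dictionary keys are hypothetical, and the "Succeeded" type is assumed from the Kubeflow job condition types, since it does not appear in the status above.

```python
from typing import Dict, List, Optional

# Condition types treated as terminal. "Succeeded" is an assumption taken
# from the Kubeflow job condition types; only "Failed" appears above.
TERMINAL_TYPES = ("Failed", "Succeeded")


def terminal_state(conditions: List[Dict[str, str]]) -> Optional[str]:
    """Return "Failed" or "Succeeded" if that condition is True, else None.

    A True "Running" condition is ignored on purpose, because it can remain
    set after the job has actually finished.
    """
    for wanted in TERMINAL_TYPES:
        for cond in conditions:
            if cond.get("type") == wanted and cond.get("status") == "True":
                return wanted
    return None


# Conditions copied from the status reported above (timestamps omitted).
conditions = [
    {"type": "Created", "status": "True", "reason": "MPIJobCreated"},
    {"type": "Failed", "status": "True", "reason": "BackoffLimitExceeded"},
    {"type": "Running", "status": "True", "reason": "MPIJobRunning"},
]

print(terminal_state(conditions))
```

With this check the pipeline stops waiting as soon as a terminal condition shows up, instead of waiting for Running to clear.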
/reopen
@tenzen-y: Reopened this issue.
/kind support
@tenzen-y: The label(s) kind/support cannot be applied, because the repository doesn't have them.
@gl-001 Sorry for the late response. IIUC, if the update fails, the controller will retry updating the MPIJob. Can you share the mpi-operator logs with us?
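
For readers unfamiliar with this error: it comes from optimistic concurrency, where a write based on a stale copy of the object is rejected and the writer is expected to re-read and try again. The sketch below simulates that pattern in plain Python. `FakeStore`, `ConflictError`, and `update_with_retry` are hypothetical names made up for illustration; this is not the mpi-operator code. In Go controllers, client-go's `retry.RetryOnConflict` helper wraps the same idea.

```python
import copy


class ConflictError(Exception):
    """Stands in for the API server's 'the object has been modified' error."""


class FakeStore:
    """In-memory stand-in for the API server, holding one versioned object."""

    def __init__(self, obj):
        self._obj = dict(obj, resourceVersion=1)

    def get(self):
        return copy.deepcopy(self._obj)

    def update(self, obj):
        # Reject writes that were based on a stale copy of the object.
        if obj["resourceVersion"] != self._obj["resourceVersion"]:
            raise ConflictError("the object has been modified")
        self._obj = copy.deepcopy(obj)
        self._obj["resourceVersion"] += 1


def update_with_retry(store, mutate, attempts=5):
    """Re-read the latest object and re-apply `mutate` until the write lands."""
    for _ in range(attempts):
        latest = store.get()
        mutate(latest)
        try:
            store.update(latest)
            return True
        except ConflictError:
            continue
    return False


store = FakeStore({"status": "Running"})
stale = store.get()            # the controller reads version 1

other = store.get()            # another writer updates the object first
other["note"] = "touched"
store.update(other)

stale["status"] = "Failed"     # the controller now writes its stale copy
try:
    store.update(stale)
    stale_write_ok = True
except ConflictError:
    stale_write_ok = False

# Retrying against a fresh read succeeds and keeps the other writer's change.
retried_ok = update_with_retry(store, lambda o: o.update(status="Failed"))
```

If retries of this kind are happening, a single conflict message in the logs is expected and harmless. A status that never converges suggests the retry is not being reached, which the operator logs should help confirm.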
I'm on version 0.4.0. Has anyone else run into this problem? I added some logging and found that the doUpdateJobStatus function returns "the object has been modified; please apply your changes to the latest version and try again". Thanks.