Closed wdbaruni closed 5 days ago
Timed-out executions were being marked as cancelled instead of failed, which is wrong. As a result, the compute node called `OnCancelComplete` on the requester node, which is a no-op. The requester node therefore marked the execution as failed only when the housekeeper kicked in, which has a buffer of 2 minutes, rather than as soon as the compute node reported the failure.

Previously, this job would be marked as failed after 2-2:30 minutes:
With this change, it is marked as failed in ~10 seconds.