Open sparshev opened 2 years ago
We need a limit on allocation failures for a label, so that the node can inform the cluster that it cannot execute that particular label, or possibly the entire driver, once allocation keeps failing.
Right now, if an allocation fails, the node just keeps retrying and failing.
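
To make the idea concrete, here is a minimal sketch of such a limit. It is not taken from the project's code: the class `AllocationFailureTracker`, its method names, and the default limits are all hypothetical, and it assumes that counting consecutive failures per label and per driver is enough to decide when the node should stop accepting work.

```python
class AllocationFailureTracker:
    """Hypothetical helper: counts consecutive allocation failures
    per (driver, label) pair and per driver."""

    def __init__(self, label_limit=3, driver_limit=10):
        # Both limits are illustrative defaults, not values from the project.
        self.label_limit = label_limit
        self.driver_limit = driver_limit
        self._label_failures = {}
        self._driver_failures = {}

    def record_failure(self, driver, label):
        """Call this when allocating `label` through `driver` fails."""
        key = (driver, label)
        self._label_failures[key] = self._label_failures.get(key, 0) + 1
        self._driver_failures[driver] = self._driver_failures.get(driver, 0) + 1

    def record_success(self, driver, label):
        """A successful allocation resets the counters for the label and its driver."""
        self._label_failures[(driver, label)] = 0
        self._driver_failures[driver] = 0

    def can_execute(self, driver, label):
        """Return False once the label or its whole driver has hit its limit."""
        if self._driver_failures.get(driver, 0) >= self.driver_limit:
            return False
        return self._label_failures.get((driver, label), 0) < self.label_limit


# Example: two failures in a row disable the label, but not the driver.
tracker = AllocationFailureTracker(label_limit=2, driver_limit=3)
tracker.record_failure("docker", "ubuntu")
tracker.record_failure("docker", "ubuntu")
label_ok = tracker.can_execute("docker", "ubuntu")
other_label_ok = tracker.can_execute("docker", "debian")
```

The node would consult `can_execute` before volunteering for an application and report a negative result to the cluster. Resetting on success is one possible policy; a time-based cooldown would be another option.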