google / automl

Google Brain AutoML
Apache License 2.0

MultiWorkerMirroredStrategy for distributed training not working on GPUs #964

Open ankur47 opened 3 years ago

ankur47 commented 3 years ago

Hi, I am using MultiWorkerMirroredStrategy and tf.estimator.train_and_evaluate for distributed training with 3 epochs. Please find the setup information below:

GPU: 4 x NVIDIA Tesla V100
Dataset: COCO
Model: EfficientDet-D5
TensorFlow: 2.4.0-gpu

Error when trying to train this model: Bad status from CompleteGroupDistributed: Failed precondition: Device /job:worker/replica:0/task:1/device:GPU:0 current incarnation doesn't match with one in the group. This usually means this worker has restarted but the collective leader hasn't, or this worker connects to a wrong cluster.
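
For context, this error usually means a worker joined with a different cluster definition than the leader, or that the workers were not restarted together. Below is a minimal sketch of the per-worker setup MultiWorkerMirroredStrategy expects; the hostnames, ports, and two-worker layout are placeholders, not my actual cluster:

```python
import json
import os

import tensorflow as tf

# Placeholder cluster spec: it must list every worker and be identical on all of them.
cluster = {"worker": ["host1:12345", "host2:12345"]}

# Each worker sets only its own task index (0 is the collective leader on host1,
# 1 on host2). TF_CONFIG has to be set before the strategy is created.
os.environ["TF_CONFIG"] = json.dumps({
    "cluster": cluster,
    "task": {"type": "worker", "index": 0},
})

# All workers must be (re)started together so their collective "incarnations" match.
strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()
```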

I have changed a few lines in the main.py file:

[screenshots of the modified main.py lines]
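
For readers who cannot see the screenshots, the change is roughly of this shape: create the multi-worker strategy and pass it to the Estimator's RunConfig via train_distribute. This is only a sketch; model_fn, train_input_fn, eval_input_fn, the model_dir, and max_steps stand in for the repo's existing EfficientDet code and my real settings:

```python
import tensorflow as tf

# Sketch only: model_fn and the input_fns stand in for the existing EfficientDet ones.
strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()

run_config = tf.estimator.RunConfig(
    model_dir="/tmp/efficientdet-d5",  # placeholder model directory
    train_distribute=strategy,         # distribute training across the workers
)

estimator = tf.estimator.Estimator(
    model_fn=model_fn,                 # existing EfficientDet model_fn
    config=run_config,
)

train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=100000)
eval_spec = tf.estimator.EvalSpec(input_fn=eval_input_fn)
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
```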

FYI: I am running in train mode only.

DirkFi commented 2 years ago

Same error here. Has your problem been solved?