pmeier opened this issue 2 years ago
I think this case is implicitly guarded at https://github.com/pytorch/vision/blob/d5bd8b728f14c33b339fc45c90ca39be339bce3f/references/classification/train.py#L87
since `len(data_loader.dataset) != num_processed_samples`
shouldn't be true in a non-distributed setting.
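For context, a minimal sketch of the implicit guard I have in mind; the exact condition in `train.py` may differ, and the warning body here is just a placeholder:

```python
import torch.distributed as dist

def maybe_warn(data_loader, num_processed_samples):
    # In a non-distributed run every sample is processed exactly once by this
    # single process, so the first clause is False and Python's `and`
    # short-circuits: the distributed-only call below is never evaluated.
    if (
        len(data_loader.dataset) != num_processed_samples
        and dist.get_rank() == 0  # would raise without an initialized process group
    ):
        print("dataset length and number of processed samples do not match")
```

If the lengths ever do differ, the `dist.get_rank()` clause is evaluated and raises in a non-distributed run.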
Do you get the error during non-distributed training, @pmeier?
If you don't set the respective env vars
https://github.com/pytorch/vision/blob/d5bd8b728f14c33b339fc45c90ca39be339bce3f/references/classification/utils.py#L255-L258
training will not be distributed and, in turn, the backend will not be initialized. However, during evaluation the check in
https://github.com/pytorch/vision/blob/d5bd8b728f14c33b339fc45c90ca39be339bce3f/references/classification/train.py#L88
runs unguarded and then fails because the process group was never initialized.
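For illustration, a minimal sketch of the failure mode and one possible guard; `is_main_process` is a hypothetical helper here, and the `dist.get_rank()` call just stands in for whatever distributed call the linked check performs:

```python
import torch.distributed as dist

def is_main_process():
    # Without the env vars above, init_process_group() is never called, so any
    # call into the default process group (e.g. dist.get_rank()) raises a
    # RuntimeError saying the default process group has not been initialized.
    # Checking availability/initialization first avoids that:
    if not (dist.is_available() and dist.is_initialized()):
        # Non-distributed run: treat the single process as the main one.
        return True
    return dist.get_rank() == 0

print(is_main_process())  # True in a plain, non-distributed run
```
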
cc @datumbox