SCLBD / DeepfakeBench

A comprehensive benchmark of deepfake detection

DataParallel problem and Confusion about how to choose train/test dataset combination with different detectors #14

Closed VincentVanNF closed 1 year ago

VincentVanNF commented 1 year ago
  1. When the `ngpu` parameter in the .yaml file is set to more than 1, training fails. As the code shows, it uses PyTorch `DataParallel` for multi-GPU parallelism, but some of the code does not handle this properly:

    losses = self.model.get_losses(data_dict, predictions)
    
    batch_metrics = self.model.get_train_metrics(data_dict, predictions)
    
    metric_one_dataset = self.model.get_test_metrics()

    When multiple GPUs are used, `self.model` is a `DataParallel` wrapper, so these methods live on `self.model.module`. Maybe it should be changed to this:

    if self.config['ngpu'] > 1:
        losses = self.model.module.get_losses(data_dict, predictions)
    else:
        losses = self.model.get_losses(data_dict, predictions)

    if self.config['ngpu'] > 1:
        batch_metrics = self.model.module.get_train_metrics(data_dict, predictions)
    else:
        batch_metrics = self.model.get_train_metrics(data_dict, predictions)

    if self.config['ngpu'] > 1:
        metric_one_dataset = self.model.module.get_test_metrics()
    else:
        metric_one_dataset = self.model.get_test_metrics()
  2. When I train using the script below, a dimension error is thrown when concatenating tensors. The train/test datasets in xception.yaml are both [FaceForensics++], but when I set the train dataset to [FaceForensics++] and the test dataset to [Celeb-DF-v1], the error disappears. Are there rules about which train/test dataset combinations work for different detectors? If so, could you list the valid combinations for each detector? If not, what causes this problem? Thanks.

    python train.py \
    --detector_path ./config/detector/xception.yaml

    (error screenshot attached)
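As a side note, the repeated `if/else` in item 1 can be collapsed into a small helper. This is only a sketch of the general PyTorch pattern, not code from DeepfakeBench; the `unwrap` name is made up here:

```python
import torch.nn as nn

def unwrap(model: nn.Module) -> nn.Module:
    # DataParallel stores the real model in its .module attribute;
    # a plain (unwrapped) module is returned unchanged.
    return model.module if isinstance(model, nn.DataParallel) else model
```

With this, each call site becomes e.g. `losses = unwrap(self.model).get_losses(data_dict, predictions)`, which works regardless of the `ngpu` setting.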

YZY-stack commented 1 year ago

Hi,

I want to bring to your attention that our current code does not support multiple GPUs, so it is necessary to use a single GPU for training. As for your second question, I have retrained the code using a single GPU and did not encounter any issues during the process. Additionally, the way you have set the train set to [FaceForensics++] and the test dataset to [Celeb-DF-v1] appears to be correct.
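One way to make sure only a single GPU is visible to the training script (assuming a CUDA setup; the device index 0 is just an example) is:

```shell
# Expose only GPU 0 to PyTorch so DataParallel is never engaged
CUDA_VISIBLE_DEVICES=0 python train.py \
    --detector_path ./config/detector/xception.yaml
```

Keeping `ngpu: 1` in the detector .yaml achieves the same effect from the config side.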

If you have any further questions or concerns, please feel free to let me know. I'm here to assist you with any training or dataset-related matters.

Thank you for your understanding and cooperation.

VincentVanNF commented 1 year ago

Thanks for your reply. I meant that setting the train set to [FaceForensics++] and the test dataset to [Celeb-DF-v1] causes no problem, but setting the train/test datasets both to [FaceForensics++] triggers the issues mentioned above. It is confusing; what might cause this issue?

YZY-stack commented 1 year ago

Thank you for your valuable suggestion. I will clarify and list the combinations of the training set and testing set for each detector.