cosanlab / py-feat

Facial Expression Analysis Toolbox
https://py-feat.org/

M1 Max compatibility #159

Open Pablo-Arias opened 1 year ago

Pablo-Arias commented 1 year ago

Hi all,

I've been having a fantastic time testing py-feat. It's a really cool package; thank you to all the developers! I ran into an issue using the GPU on my M1 Max MacBook Pro while extracting data from a video with the following code:


from feat import Detector

# New detector
detector = Detector(
    face_model="retinaface",
    landmark_model="mobilefacenet",
    au_model="xgb",
    emotion_model="resmasknet",
    facepose_model="img2pose",
    device="auto",
)
test_video = "example.mp4"
video_prediction = detector.detect_video(test_video, skip_frames=1000)

The error message is:

RuntimeError: Input type (torch.FloatTensor) and weight type (MPSFloatType) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

This seems to be due to the data not being loaded onto the GPU correctly (see here), but I don't know whether the problem is in py-feat or in torch.
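
To illustrate the kind of mismatch involved, here is a minimal plain-pytorch sketch (not py-feat code) that triggers the same RuntimeError when the weights sit on the MPS device while the input stays on the CPU:

import torch
import torch.nn as nn

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

conv = nn.Conv2d(3, 8, kernel_size=3).to(device)  # weights moved to MPS
x = torch.rand(1, 3, 64, 64)                      # input left on the CPU

# conv(x) raises the device-mismatch RuntimeError above;
# moving the input to the model's device fixes it:
y = conv(x.to(device))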

This is the complete output:


RuntimeError                              Traceback (most recent call last)
Cell In[5], line 1
----> 1 video_prediction = detector.detect_video(test_video, skip_frames=1000)
      2 video_prediction.head()

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/feat/detector.py:795, in Detector.detect_video(self, video_path, skip_frames, output_size, batch_size, num_workers, pin_memory, **detector_kwargs)
    791 faces = self.detect_faces(batch_data["Image"], **detector_kwargs)
    792 landmarks = self.detect_landmarks(
    793     batch_data["Image"], detected_faces=faces, **detector_kwargs
    794 )
--> 795 poses = self.detect_facepose(batch_data["Image"], **detector_kwargs)
    796 aus = self.detect_aus(batch_data["Image"], landmarks, **detector_kwargs)
    797 emotions = self.detect_emotions(
    798     batch_data["Image"], faces, landmarks, **detector_kwargs
    799 )

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/feat/detector.py:492, in Detector.detect_facepose(self, frame, landmarks, **facepose_model_kwargs)
    489 frame = convert_image_to_tensor(frame, img_type="float32") / 255
    491 if "img2pose" in self.info["facepose_model"]:
--> 492     faces, poses = self.facepose_detector(frame, **facepose_model_kwargs)
    493 else:
    494     poses = self.facepose_detector(frame, landmarks, **facepose_model_kwargs)

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/feat/facepose_detectors/img2pose/img2pose_test.py:131, in Img2Pose.__call__(self, img_)
    129 poses = []
    130 for img in img_:
--> 131     preds = self.scale_and_predict(img)
    132     faces.append(preds["boxes"])
    133     poses.append(preds["poses"])

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/feat/facepose_detectors/img2pose/img2pose_test.py:160, in Img2Pose.scale_and_predict(self, img, euler)
    157     scale = transformed_img["Scale"]
    159 # Predict
--> 160 preds = self.predict(img, border_size=border_size, scale=scale, euler=euler)
    162 # If the prediction is unsuccessful, try adding a white border to the image. This can improve bounding box
    163 # performance on images where face takes up entire frame, and images located at edge of frame.
    164 if len(preds["boxes"]) == 0:

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/feat/facepose_detectors/img2pose/img2pose_test.py:188, in Img2Pose.predict(self, img, border_size, scale, euler)
    174 """Runs the img2pose model on the passed image and returns bboxes and face poses.
    175 
    176 Args:
   (...)
    184 
    185 """
    187 # Obtain prediction
--> 188 pred = self.model.predict([img])[0]
    189 # pred = self.model.predict(img)[0]
    190 boxes = pred["boxes"].cpu().numpy().astype("float")

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/feat/facepose_detectors/img2pose/img2pose_model.py:103, in img2poseModel.predict(self, imgs)
    100 assert self.fpn_model.training is False
    102 with torch.no_grad():
--> 103     predictions = self.run_model(imgs)
    105 return predictions

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/feat/facepose_detectors/img2pose/img2pose_model.py:92, in img2poseModel.run_model(self, imgs, targets)
     91 def run_model(self, imgs, targets=None):
---> 92     outputs = self.fpn_model(imgs, targets)
     93     return outputs

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/torch/nn/parallel/data_parallel.py:153, in DataParallel.forward(self, *inputs, **kwargs)
    151 with torch.autograd.profiler.record_function("DataParallel.forward"):
    152     if not self.device_ids:
--> 153         return self.module(*inputs, **kwargs)
    155     for t in chain(self.module.parameters(), self.module.buffers()):
    156         if t.device != self.src_device_obj:

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/feat/facepose_detectors/img2pose/deps/generalized_rcnn.py:60, in GeneralizedRCNN.forward(self, images, targets)
     58     original_image_sizes.append((val[0], val[1]))
     59 images, targets = self.transform(images, targets)
---> 60 features = self.backbone(images.tensors)
     61 if isinstance(features, torch.Tensor):
     62     features = OrderedDict([("0", features)])

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/torchvision/models/detection/backbone_utils.py:57, in BackboneWithFPN.forward(self, x)
     56 def forward(self, x: Tensor) -> Dict[str, Tensor]:
---> 57     x = self.body(x)
     58     x = self.fpn(x)
     59     return x

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/torchvision/models/_utils.py:69, in IntermediateLayerGetter.forward(self, x)
     67 out = OrderedDict()
     68 for name, module in self.items():
---> 69     x = module(x)
     70     if name in self.return_layers:
     71         out_name = self.return_layers[name]

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/torch/nn/modules/conv.py:463, in Conv2d.forward(self, input)
    462 def forward(self, input: Tensor) -> Tensor:
--> 463     return self._conv_forward(input, self.weight, self.bias)

File ~/opt/anaconda3/envs/openAU/lib/python3.9/site-packages/torch/nn/modules/conv.py:459, in Conv2d._conv_forward(self, input, weight, bias)
    455 if self.padding_mode != 'zeros':
    456     return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
    457                     weight, bias, self.stride,
    458                     _pair(0), self.dilation, self.groups)
--> 459 return F.conv2d(input, weight, bias, self.stride,
    460                 self.padding, self.dilation, self.groups)

RuntimeError: Input type (torch.FloatTensor) and weight type (MPSFloatType) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
ljchang commented 1 year ago

Thanks for the feedback @Pablo-Arias. We have set up py-feat so it can potentially use the M1 GPU, but this is still a relatively new feature in pytorch and I don't think it is quite ready yet. From my understanding, broad support for M1 GPUs was added to pytorch last summer, but there are still many places where it does not work well. For example, some of the base models included in pytorch are not yet compatible with M1 GPUs, and since many of our detector modules rely on those base models, this functionality is not yet ready for prime time. We are hoping that as the pytorch developers resolve these issues, py-feat will benefit from that work.
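
For anyone experimenting with this, pytorch also ships an opt-in CPU fallback for operators that MPS does not implement yet. It won't fix the device-mismatch error above, but it can help with missing-operator failures:

import os

# Must be set before torch is imported to take effect.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

print(torch.backends.mps.is_available())  # MPS usable on this machine?
print(torch.backends.mps.is_built())      # was this torch build compiled with MPS?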

This particular error is actually different from the ones I encountered this past summer when I was working on this, which suggests that progress is being made in the pytorch package. We would love any help testing this further, identifying ways we could speed up making this work within py-feat, and potentially sending feedback to the pytorch development team as well.
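
In the meantime, a safe workaround is to pin the detector to the CPU instead of device="auto"; a sketch using the same models as in your example:

from feat import Detector

# Force the CPU until MPS support in pytorch stabilizes; this avoids
# the CPU/MPS tensor mismatch at the cost of speed.
detector = Detector(
    face_model="retinaface",
    landmark_model="mobilefacenet",
    au_model="xgb",
    emotion_model="resmasknet",
    facepose_model="img2pose",
    device="cpu",
)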

dheshanm commented 1 year ago

Hi, I am trying to run py-feat on my Linux workstation, and I'm running into the same error. I don't think this issue is limited to M1 GPUs.

When I try to run:

import logging

from feat import Detector

detector = Detector(
    device="cuda",
    face_model="mtcnn",
    landmark_model="pfld",
    au_model="xgb",
    emotion_model="svm",
    facepose_model="img2pose",
)

try:
    features = detector.detect_image(
        [image_path],
        output_size=700,
    )
# check for errors
except Exception as e:
    logging.error("Error while extracting features: " + str(e))

With logging.basicConfig(level=logging.INFO) enabled, I get the following:

Loading PyFeat models...
INFO:root:Loading Face model: mtcnn
INFO:root:Loading Facial Landmark model: pfld
INFO:root:Loading facepose model: img2pose
INFO:root:Loading AU model: xgb
INFO:root:Loading emotion model: svm
Running PyFeat...
  0%|                                                                                | 0/1 [00:00<?, ?it/s]INFO:root:ImageDataSet: RESCALING WARNING: from torch.Size([3, 2160, 3840]) to output_size=700
/home/<user>/micromamba/envs/pyfeat/lib/python3.11/site-packages/torchvision/transforms/functional.py:1603: UserWarning: The default value of the antialias parameter of all the resizing transforms (Resize(), RandomResizedCrop(), etc.) will change from None to True in v0.17, in order to be consistent across the PIL and Tensor backends. To suppress this warning, directly pass antialias=True (recommended, future default), antialias=None (current default, which means False for Tensors and True for PIL), or antialias=False (only works on Tensors - PIL will still use antialiasing). This also applies if you are using the inference transforms from the models weights: update the call to weights.transforms(antialias=True).
  warnings.warn(
INFO:root:detecting faces...
  0%|                                                                                | 0/1 [00:00<?, ?it/s]
ERROR:root:Error while extracting features: when using a batch_size > 1 all images must have the same dimensions or output_size must not be None so py-feat can rescale images to output_size. See pytorch error: 
Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

When I try detect_video I get the same error. But when I switch to device="cpu" on my Detector, everything works.
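
A possible stopgap is a small helper (hypothetical, not part of py-feat) that tries CUDA first and falls back to the CPU when inference fails:

from feat import Detector

def detect_with_fallback(image_path, **model_kwargs):
    # Hypothetical helper: try CUDA first, fall back to CPU if the forward pass fails.
    for device in ("cuda", "cpu"):
        detector = Detector(device=device, **model_kwargs)
        try:
            return detector.detect_image([image_path], output_size=700)
        except RuntimeError:
            continue
    raise RuntimeError("Feature detection failed on every device")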

I'm running: py-feat==0.6.1

ljchang commented 1 year ago

Thanks for sharing @dheshanm. Do you mind also sharing some details about your system, particularly the versions of pytorch, torchvision, and cuda, as well as the type of GPU, the GPU driver, and the flavor of linux? We haven't seen that particular error yet on our Ubuntu system with an NVIDIA 3090. We have found that GPU support can be very finicky depending on the combination of cuda, pytorch, and GPU drivers.
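
Something like the following should print most of it in one go (a quick sketch; it assumes feat exposes a __version__ attribute):

import platform

import torch
import torchvision
import feat

print("py-feat:", feat.__version__)  # assumes feat.__version__ exists
print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("CUDA version:", torch.version.cuda)
    print("GPU:", torch.cuda.get_device_name(0))
print("OS:", platform.platform())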

dheshanm commented 1 year ago

Here is some information about the system I used to run the code:

py-feat==0.6.1
torch==2.0.1
torchvision==0.15.2

CUDA version: 12.2
GPU type: NVIDIA Quadro K2200
GPU driver version: 535.98
Linux flavor: CentOS 7