Object detection explainability

caglayantuna commented 5 months ago

Module

Attributions Methods

Current Behavior

Dear Deel-ai team,

Thanks for this library, it is very well structured and very educative. However, I have 2 questions or problems when I use your library.

1- Running the OD notebook takes more than 40 minutes. Processing time is especially long for the Occlusion and Rise methods. Is this normal?

2- I am trying to apply your attribution methods to my Ultralytics YOLOv5 model. Unfortunately, I get a "grad_fn" error when I apply gradient-based methods such as Saliency. Gradient-free methods such as Occlusion are very slow, as in the Colab notebook, and I have not obtained any meaningful result so far. I checked the similar issue #157 but could not find a solution there. Do you have any experience with these models?

Expected Behavior

It would be nice to run attribution methods for ultralytics OD models without any problem.

Version

1.3.3

Environment

- OS: Windows
- Python version: 3.12
- Tensorflow version: 2.16.1
- Packages used version:

Relevant log output

RuntimeError                              Traceback (most recent call last)
Cell In[17], line 7
      5 for i in range(1):
      6   torch.cuda.empty_cache()
----> 7   explanation = explainer.explain(processed_tf_inputs, man_bounding_box)

File ~\AppData\Roaming\Python\Python312\site-packages\xplique\attributions\base.py:32, in sanitize_input_output.<locals>.sanitize(self, inputs, targets, *args)
     30 inputs, targets = tensor_sanitize(inputs, targets)
     31 # then enter the explanation function
---> 32 return explanation_method(self, inputs, targets, *args)

File ~\AppData\Roaming\Python\Python312\site-packages\xplique\attributions\base.py:221, in WhiteBoxExplainer._harmonize_channel_dimension.<locals>.explain(self, inputs, targets)
    196 def explain(self,
    197             inputs: Union[tf.data.Dataset, tf.Tensor, np.array],
    198             targets: Optional[Union[tf.Tensor, np.array]] = None) -> tf.Tensor:
    199     """
    200     Compute the explanations of the given inputs.
    201     Accept Tensor, numpy array or tf.data.Dataset (in that case targets is None)
   (...)
    219         Explanation generated by the method.
    220     """
--> 221     explanations = explain_method(self, inputs, targets)
    223     if len(explanations.shape) == 3 and len(inputs.shape) == 4:
    224         explanations = tf.expand_dims(explanations, axis=-1)
...
    272     allow_unreachable=True,
    273     accumulate_grad=True,
    274 )

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

To Reproduce


import torch
import torch.nn as nn
import xplique
from xplique.attributions import (Saliency, GradientInput, IntegratedGradients, SmoothGrad, VarGrad, SquareGrad,
                                  Occlusion, Rise, SobolAttributionMethod, HsicAttributionMethod)
from xplique.wrappers import TorchWrapper
# `non_max_suppression` is assumed to come from the YOLOv5 repository (utils.general)
# `model`, `device` and `processed_tf_inputs` are defined earlier in the notebook (not shown)

class ModelWrapper(torch.nn.Module):
    # WARNING: `torch.nn.Module` specific to pytorch
    # `tf.keras.Model` instead for tensorflow models

    def __init__(self, model):
        super(ModelWrapper, self).__init__()
        self.model = model.eval()

    #def __call__(self, torch_inputs):
    #    # this method should change depending on the model
    #    predictions = self.model(torch_inputs)
    #    return torch.stack([self.format_predictions(pred) for pred in predictions], dim=0)

    def __call__(self, torch_inputs):
        # a single forward pass is enough; the original ran the model twice
        result = self.model(torch_inputs)
        print("Predictions shape before NMS:", result[0].shape)
        # keep the detections of the first image only (confidence threshold 0.25)
        predictions = non_max_suppression(result, 0.25)[0]
        #predictions = self.transform_input(predictions)
        return torch.stack([self.format_predictions(pred) for pred in [predictions]], dim=0)

    def format_predictions(self, predictions):
        # format prediction for them to match Xplique object detection operator
        # a single tensor of shape (nb_boxes, 4 + 1 + nb_classes)
        # box coordinates defined by (x1, y1, x2, y2) respectively (left, bottom, right, top).
        # after NMS each row is [x1, y1, x2, y2, score, class_id]: the score is column 4, not 5
        return torch.cat([predictions[:, :4],
                          predictions[:, 4].unsqueeze(dim=1),
                          # WARNING ! The user should use the class logits and not ones
                          torch.ones((predictions.shape[0], 1)).to(predictions.device)],
                         dim=1)

    @staticmethod
    def transform_input(input_tensor):
        # Assuming input_tensor is of shape [N, 6] and format [x_min, y_min, x_max, y_max, score, class_id]

        # Separate the bounding boxes, scores, and class_ids
        boxes = input_tensor[:, :4]  # First 4 columns are the bounding box coordinates
        scores = input_tensor[:, 4]  # 5th column is the score
        labels = input_tensor[:, 5].long()  # 6th column is the class_id, converted to long for PyTorch compatibility

        # Wrap boxes, scores, and labels into a dictionary to match the desired output format
        return {
            'boxes': boxes,
            'scores': scores,
            'labels': labels
        }

    def to(self, device):
        # WARNING: specific to pytorch
        self.model.to(device)
        return self

    def zero_grad(self):
        # WARNING: specific to pytorch
        self.model.zero_grad()
        return self

object_detection_model = ModelWrapper(model).eval()
wrapped_model = TorchWrapper(object_detection_model, device, is_channel_first=True)
tf_predictions = wrapped_model(processed_tf_inputs)
man_bounding_box = tf_predictions[:, 0]

explainer = Saliency(wrapped_model, operator=xplique.Tasks.OBJECT_DETECTION, batch_size=16)
with torch.no_grad():
  for i in range(1):
    torch.cuda.empty_cache()
    explanation = explainer.explain(processed_tf_inputs, man_bounding_box)

AntoninPoche commented 5 months ago

Dear @caglayantuna,

Thank you very much for your interest in our library. Regarding your questions and problems:

Rise and Occlusion can indeed be time-consuming, as they run the model on many perturbed copies of each input to estimate which pixels are important. Furthermore, the free Google Colab tier can be slow when many inferences are needed. Nonetheless, I have several suggestions:
- If the GPU supports it, you can increase the batch_size parameter.
- Another way to save time is to explain several images in a single call by passing all the images you want to explain together.
- For parameter settings, the Rise tutorial and the Occlusion tutorial give guidance on how to reduce the running time of these methods. To summarize: for Rise, you can try to reduce the nb_samples parameter. If the results become too noisy, you can then reduce grid_size, but explanations may no longer be precise enough; it is a trade-off. For Occlusion, you can try to increase patch_stride, but patch_size should stay a multiple of patch_stride.
- You can try to use SobolAttributionMethod or HsicAttributionMethod; these methods use more optimized sampling and thus require fewer inferences.
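To get a feel for the trade-offs in the list above, the number of forward passes per image can be estimated. The sketch below is only a back-of-the-envelope count: it assumes a plain sliding window without padding for Occlusion (Xplique's exact count at the image borders may differ) and one inference per random mask for Rise. The helper names and the 640x640 input size are hypothetical, chosen for illustration.

```python
def occlusion_inferences(height, width, patch_size, patch_stride):
    """Rough number of occluded copies needed to explain one image.

    Assumption: plain sliding window, no padding at the borders.
    """
    rows = (height - patch_size) // patch_stride + 1
    cols = (width - patch_size) // patch_stride + 1
    return rows * cols


def rise_inferences(nb_samples):
    """Rise evaluates the model once per random mask."""
    return nb_samples


# Doubling the stride divides the Occlusion cost by roughly four.
small_stride = occlusion_inferences(640, 640, patch_size=32, patch_stride=16)
large_stride = occlusion_inferences(640, 640, patch_size=32, patch_stride=32)
print(small_stride, large_stride)  # 1521 400

# The Rise cost scales linearly with nb_samples.
print(rise_inferences(4000), rise_inferences(1000))  # 4000 1000
```

These counts are per explained image, so they multiply with the number of images and divide (in wall-clock terms) with whatever batch_size the GPU can hold.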
In your code, you call explainer.explain inside a torch.no_grad() context. This context disables gradient tracking, so gradient-based methods have nothing to backpropagate through. I tried adding this line in the tutorial and obtained the same error as the one you mentioned. Thus, I suggest that you remove it, which I hope will solve your issue.
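The effect of torch.no_grad() can be reproduced without Xplique or YOLOv5. Here is a minimal PyTorch sketch, where a toy tensor stands in for the model (an assumption made purely for illustration):

```python
import torch

weights = torch.ones(3, requires_grad=True)

# Forward pass inside no_grad: no computation graph is recorded,
# so the result has no grad_fn.
with torch.no_grad():
    score_no_graph = (weights * 2.0).sum()

try:
    score_no_graph.backward()
    error_message = ""
except RuntimeError as err:
    error_message = str(err)

print(error_message)

# Same forward pass outside no_grad: the graph exists and backward works.
score = (weights * 2.0).sum()
score.backward()
print(weights.grad)
```

The first print shows the same "does not require grad and does not have a grad_fn" message as in the log above, while the second forward pass yields gradients as expected.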
I would also like to make a remark on the wrapper you used for the model: the NMS (non-maximum suppression) function should be removed from your model inference in an explainability setting. You can keep it at inference time to select the instances to explain, but it should be removed from the model you explain. More information and justification can be found in the "in practice" section of the object detection documentation.
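As an illustration of a wrapper output that stays differentiable without NMS, the sketch below converts raw predictions to the (x1, y1, x2, y2, objectness, class scores) layout described in the comments of the issue's code. It assumes the raw rows are laid out as (cx, cy, w, h, objectness, class scores...), which should be checked against the actual model output. format_raw_predictions is a hypothetical helper, not part of Xplique or Ultralytics.

```python
import torch


def format_raw_predictions(raw):
    """Convert raw boxes to corner format using differentiable ops only.

    raw: tensor of shape (nb_boxes, 4 + 1 + nb_classes), assumed to be
    laid out as (cx, cy, w, h, objectness, class scores...).
    """
    cx, cy, w, h = raw[:, 0], raw[:, 1], raw[:, 2], raw[:, 3]
    boxes = torch.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], dim=1)
    # Objectness and class scores are passed through unchanged.
    return torch.cat([boxes, raw[:, 4:]], dim=1)


# One toy box with two classes (made-up values).
raw = torch.tensor([[50.0, 40.0, 20.0, 10.0, 0.9, 0.1, 0.8]], requires_grad=True)
formatted = format_raw_predictions(raw)
print(formatted.shape)

# Gradients flow back to the raw predictions, unlike through NMS filtering.
formatted.sum().backward()
print(raw.grad is not None)
```

NMS can still be run separately, outside the explained model, to pick the box that is passed as the target to explain.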

I hope this allows you to apply both perturbation-based and gradient-based methods on Ultralytics' models.

Best regards, Antonin

caglayantuna commented 5 months ago

Hi Antonin,

Thanks for your answer, it helped me a lot. I wanted to try everything you suggested before replying. To be honest, I could not manage to run gradient-based methods with my model. However, by coincidence, I met Thomas Fel, and he told me that PyTorch could be the cause of this issue. He said that the library will be improved soon to solve it.

As you suggested, I tried Rise with a reduced number of samples and it worked very well for my OD task. Thank you very much :)

Best,
