ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
GNU Affero General Public License v3.0
49.89k stars 16.14k forks

How to extract features responsible for particular object? #385

Closed bingiflash closed 4 years ago

bingiflash commented 4 years ago


I want to see if we can extract the features that are contributing to the prediction of an object. Is there a way to do it? If so at which layer should I be taking these intermediate features?

github-actions[bot] commented 4 years ago

Hello @bingiflash, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook, Docker Image, and Google Cloud Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:

For more information please visit https://www.ultralytics.com.

glenn-jocher commented 4 years ago

@bingiflash you could try a grad-cam style approach. https://keras.io/examples/vision/grad_cam/
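For readers unfamiliar with the idea, a Grad-CAM style map weights each channel of a chosen feature map by the gradient of a scalar score with respect to it. The sketch below uses a tiny stand-in classifier, not YOLOv5; `TinyNet`, `grad_cam`, and the choice of target layer are assumptions made for illustration. For a detector you would also have to choose which scalar to differentiate (for example one detection's confidence), which this sketch does not address.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyNet(nn.Module):
    """Hypothetical stand-in classifier; not a YOLOv5 model."""

    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(16, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.head(x.mean(dim=(2, 3)))  # global average pool, then classify


def grad_cam(model, layer, image, class_idx):
    """Return an (H, W) map scaled to [0, 1] for one image of shape (1, 3, H, W)."""
    store = {}

    def hook(_module, _inputs, output):
        store["act"] = output  # keep the graph so gradients can flow back to it

    handle = layer.register_forward_hook(hook)
    try:
        score = model(image)[0, class_idx]
    finally:
        handle.remove()

    grads = torch.autograd.grad(score, store["act"])[0]      # (1, C, h, w)
    weights = grads.mean(dim=(2, 3), keepdim=True)           # per-channel importance
    cam = F.relu((weights * store["act"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    cam = cam[0, 0].detach()
    return cam / (cam.max() + 1e-8)


torch.manual_seed(0)
net = TinyNet().eval()
img = torch.rand(1, 3, 32, 32)
heatmap = grad_cam(net, net.features, img, class_idx=1)
print(heatmap.shape)
```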

Might be an interesting feature to incorporate here. Feel free to submit a PR if your work proves fruitful!

bingiflash commented 4 years ago

@glenn-jocher Thanks for the fast response. I am hoping to attach a mask head to do instance segmentation, but for that I need an access point to a feature map in the network, something like Mask R-CNN, but with YOLO in place of Faster R-CNN. I am looking for the layer at which the features are passed to the Detect layer for bounding box prediction.

glenn-jocher commented 4 years ago

@bingiflash when you load a model it shows you exactly which stages are being passed to Detect():

[Screenshot: Screen Shot 2020-07-13 at 12 28 48 PM]

bingiflash commented 4 years ago

@glenn-jocher Thank you. I'll try that.

github-actions[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

AndreaBrg commented 3 years ago

@bingiflash Sorry to revive this thread, but I'm also interested in this feature. Did you have any success in incorporating Grad-CAM in YOLOv5?

bingiflash commented 3 years ago

@AndreaBrg Sorry, but Grad-CAM wasn't exactly the reason I wanted to extract features, so I didn't work on it.

rsomani95 commented 3 years ago

@bingiflash Did you have any success with integrating a mask head for instance segmentation? If yes, could you share some insights? Thanks.

Related: https://github.com/ultralytics/yolov5/issues/1123

burhr2 commented 3 years ago

Hi @bingiflash, did you manage to integrate the Mask R-CNN head? Please give us some insights on whether it worked or not.

cristophersfr commented 3 years ago

Same question as above, any advances on this matter?

Edwardmark commented 3 years ago

Same question as above, any advances on this matter? @glenn-jocher How to get the features corresponding to objects?

glenn-jocher commented 3 years ago

@Edwardmark 'features' corresponding to objects? Every weight and bias in the entire model is responsible for every output to varying degrees, this is the nature of AI and the reason for its performance, so I don't understand your question.

Edwardmark commented 3 years ago

@glenn-jocher I mean: for each object, can we get its RoI features, as in Faster R-CNN, before the features are fed to the detection head to produce the bbox and classification scores? For example, suppose we detect a car with bbox [10, 20, 100, 120], which corresponds to the 1170th anchor among all anchors; each anchor can then be mapped to a certain RoI of the features.

glenn-jocher commented 3 years ago

@Edwardmark in YOLO, classification and detection occur simultaneously, unlike in older two-stage detectors like Faster R-CNN.

Edwardmark commented 3 years ago

@glenn-jocher I understand that, but what I mean is that before the detection branch (classification and regression) we can get features for each anchor, which is entirely feasible. I just want to extract the features corresponding to each anchor (according to their spatial coordinate correspondence), so that I can use them for further work, such as another classification step.
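The spatial correspondence described here can be made concrete. The sketch below decodes a flat prediction index into a detection level, anchor, and grid cell. It assumes a 640x640 input, strides of 8, 16 and 32, three anchors per level, and that predictions are flattened per level in (anchor, row, column) order before the levels are concatenated; these assumptions should be checked against the Detect layer of the version in use. `locate_prediction` is a hypothetical helper, not part of the repository.

```python
def locate_prediction(index, img_size=640, strides=(8, 16, 32), num_anchors=3):
    """Map a flat prediction index to (level, anchor, row, col, stride).

    Assumes predictions are flattened per level in (anchor, row, col) order
    and the levels are concatenated from the smallest stride to the largest.
    """
    offset = 0
    for level, stride in enumerate(strides):
        ny = nx = img_size // stride      # grid size at this level
        count = num_anchors * ny * nx     # number of predictions at this level
        if index < offset + count:
            local = index - offset
            anchor, cell = divmod(local, ny * nx)
            row, col = divmod(cell, nx)
            return level, anchor, row, col, stride
        offset += count
    raise IndexError(f"index {index} is out of range for {offset} predictions")


level, anchor, row, col, stride = locate_prediction(1170)
print(level, anchor, row, col, stride)
# Pixel region of the input covered by that grid cell (x1, y1, x2, y2)
print(col * stride, row * stride, (col + 1) * stride, (row + 1) * stride)
```

Under the same assumptions, the feature vector behind that prediction would be the slice at `[:, row, col]` of the feature map passed to the detection head at that level.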

glenn-jocher commented 3 years ago

@Edwardmark you can extract any intermediate values from the model by updating the module you're interested in, or simply placing code in the model forward function. See yolo.py for the model forward function.
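One way to do this without editing the model code is a PyTorch forward hook. The sketch below uses a small stand-in network so that it runs without downloading weights; the layer indices and the `captured` dictionary are assumptions for illustration. With a real YOLOv5 model the hook would be registered on whichever submodule feeds Detect().

```python
import torch
import torch.nn as nn

# Stand-in network; with YOLOv5 you would index into the loaded model instead.
net = nn.Sequential(
    nn.Conv2d(3, 8, 3, stride=2, padding=1),
    nn.SiLU(),
    nn.Conv2d(8, 16, 3, stride=2, padding=1),
    nn.SiLU(),
    nn.Conv2d(16, 4, 1),  # plays the role of a prediction head
).eval()

captured = {}


def save_output(name):
    """Build a hook that stores a module's output under the given name."""
    def hook(_module, _inputs, output):
        captured[name] = output.detach()
    return hook


# Capture the activations that enter the final (head) layer.
handle = net[3].register_forward_hook(save_output("pre_head"))

with torch.no_grad():
    out = net(torch.rand(1, 3, 64, 64))

handle.remove()  # remove the hook once the features are captured

print(captured["pre_head"].shape)
print(out.shape)
```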

Edwardmark commented 3 years ago

> @Edwardmark you can extract any intermediate values from the model by updating the module you're interested in, or simply placing code in the model forward function. See yolo.py for the model forward function.


glenn-jocher commented 3 years ago

@bingiflash @AndreaBrg @Edwardmark good news 😃! Feature map visualization was added ✅ in PR #3804 by @Zigars today. This allows for visualizing feature maps from any part of the model from any function (i.e. detect.py, train.py, test.py). Feature maps are saved as *.png files in runs/features/exp directory. To turn on feature visualization set feature_vis=True in the model forward method and define the layer you want to visualize (default is SPP layer).


To receive this update:

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!


AndreaBrg commented 3 years ago

Hi, this is great news, thanks @Zigars and @glenn-jocher for setting this up.

Edwardmark commented 3 years ago

@glenn-jocher Thanks for your great work!

besbesmany commented 3 years ago

I searched for feature_vis=True in every .py file but can't find it, so how do I visualize features in YOLOv5?

glenn-jocher commented 3 years ago

@besbesmany to visualize features:

python detect.py --visualize
caraevangeline commented 2 years ago

Hi @glenn-jocher, I am looking for the same thing: "How to extract features responsible for particular object?" I couldn't find the exact answer to my question, sorry if I have missed it somewhere. Does the feature map visualisation give me a heat-map of the important features that help in deciding an object's bounding box? I couldn't interpret the heat-map that I got as the output of the visualisation.

For example, if my model detects a person in an image, I would like to segment only the person features excluding the background pixels. Would this be possible with feature map visualisation?

Thanks in advance!

glenn-jocher commented 2 years ago

@caraevangeline 👋 Hello! Thanks for asking about feature visualization. YOLOv5 🚀 features can be visualized through all stages of the model from input to output. To visualize features from a given source run detect.py with the --visualize flag:

python detect.py --weights yolov5s.pt --source data/images/bus.jpg --visualize

An example Notebook visualizing bus.jpg features with YOLOv5s is shown below:

[Screenshot: Screenshot 2021-08-30 at 16 44 04]

All stages are visualized by default, each with its own PNG showing the first 32 feature maps output from that stage. You can open any PNG for a closer look. For example the first 32 feature maps of the Focus() layer output are shown in stage0_Focus_features.png:


Feature maps may be customized by updating the feature_visualization() function in utils/plots.py: https://github.com/ultralytics/yolov5/blob/bb5ebc290e5d630a081d7cbc5a9725ed8cea0a24/utils/plots.py#L403-L427

Good luck 🍀 and let us know if you have any other questions!

caraevangeline commented 2 years ago

@glenn-jocher Thanks for this, I have looked into this before. I understand I can customise the feature map, but would this give me just the object of interest (heat-map is sufficient) excluding background pixels? I mean, I see the entire image rather than the object itself, so I find it difficult to interpret the feature map

Thanks for your prompt reply!

glenn-jocher commented 2 years ago

@caraevangeline it's really up to you to handle the feature maps however you want, we simply provide a tool for exposing them.
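As one possible way of handling the maps, the sketch below collapses a feature map into a single activation heat-map and zeroes everything outside a detected box. This is a crude activation map rather than an attribution method, and the function name `box_heatmap`, the random feature map, and the box values are placeholders chosen for illustration.

```python
import torch
import torch.nn.functional as F


def box_heatmap(features, box_xyxy, img_hw):
    """features: (C, h, w); box_xyxy: pixel coords in the image; returns (H, W) in [0, 1]."""
    H, W = img_hw
    heat = features.abs().mean(dim=0)[None, None]   # average over channels -> (1, 1, h, w)
    heat = F.interpolate(heat, size=(H, W), mode="bilinear", align_corners=False)[0, 0]
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)
    x1, y1, x2, y2 = (int(v) for v in box_xyxy)
    mask = torch.zeros_like(heat)
    mask[max(y1, 0):min(y2, H), max(x1, 0):min(x2, W)] = 1.0  # keep only the box
    return heat * mask


torch.manual_seed(0)
fmap = torch.rand(32, 20, 20)  # placeholder for a captured feature map
heat = box_heatmap(fmap, (10, 20, 100, 120), (160, 160))
print(heat.shape)
```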

caraevangeline commented 2 years ago

> @caraevangeline it's really up to you to handle the feature maps however you want, we simply provide a tool for exposing them.

Thank you @glenn-jocher

boukir commented 1 year ago

Hi @caraevangeline, did you find an answer to your question, i.e. how to retrieve just the features that contributed to the detected object?

Thanks in advance

hariouat commented 1 year ago

Hello everyone, is it possible to visualize the feature maps during training?

mohamedsouguir commented 1 year ago

Hello, did you find a way to extract the feature vector of the detected object?

glenn-jocher commented 1 year ago

@tibbar_upp You can extract the object feature vector using the following steps:

  1. First, load the pretrained YOLOv5s model by calling torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True).

  2. Then, run inference on your input image using the loaded model by calling result = model(im), where im is your input image.

  3. From the result object, you can extract each object's class by calling result.pred[0][:, -1].cpu().numpy(). Here result.pred[0] holds one row per detected object in the image, containing the bounding box coordinates, the confidence score, and the class index.

  4. You can then use the torchvision.transforms.functional.crop() function to crop the input image around the detected bounding box, and then apply any desired image processing operations to the cropped image (e.g. resizing, color normalization, etc.).

  5. Finally, to obtain the feature vector for the detected object, you can pass the cropped and processed image through the YOLOv5s model using the model.forward() method and capture the desired convolutional feature map from an intermediate layer (for example with a forward hook), since the model's final output contains detections rather than feature maps.
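Steps 3 to 5 can be sketched as follows. To stay runnable offline, a random tensor stands in for the image, a hand-written tensor stands in for result.pred[0], and a tiny convolutional stack stands in for the backbone; `object_feature_vectors` is a hypothetical helper. Only the row layout of x1, y1, x2, y2, confidence, class is taken from YOLOv5, and the crop uses tensor slicing instead of torchvision for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def object_feature_vectors(image, pred, backbone, crop_size=64):
    """image: (3, H, W) in [0, 1]; pred: (n, 6) rows of x1, y1, x2, y2, conf, cls."""
    _, H, W = image.shape
    vectors = []
    for x1, y1, x2, y2 in pred[:, :4].round().long().tolist():
        x1, y1 = max(x1, 0), max(y1, 0)
        x2, y2 = min(x2, W), min(y2, H)
        if x2 <= x1 or y2 <= y1:
            continue  # skip degenerate boxes
        crop = image[:, y1:y2, x1:x2][None]  # crop and add a batch dimension
        crop = F.interpolate(crop, size=(crop_size, crop_size),
                             mode="bilinear", align_corners=False)
        with torch.no_grad():
            fmap = backbone(crop)                  # (1, C, h, w)
        vectors.append(fmap.mean(dim=(2, 3))[0])   # global average pool -> (C,)
    return torch.stack(vectors) if vectors else torch.empty(0)


torch.manual_seed(0)
# Stand-in backbone; a real model's intermediate layers would be used instead.
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.SiLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.SiLU(),
).eval()
image = torch.rand(3, 240, 320)
pred = torch.tensor([[10.0, 20.0, 100.0, 120.0, 0.9, 2.0],
                     [150.0, 30.0, 310.0, 200.0, 0.8, 0.0]])
feats = object_feature_vectors(image, pred, backbone)
print(feats.shape)
```

Pooling the feature map into one vector per object is a design choice made here for simplicity; keeping the full map is equally possible if spatial detail matters.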

I hope this helps! Let me know if you have any further questions.

prince0310 commented 1 year ago

import torch
import torchvision.transforms as transforms
from PIL import Image
from torchvision import models
from torchsummary import summary

model = torch.hub.load('ultralytics/yolov5', 'yolov5s-cls', pretrained=True)

model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s-cls.pt')  # load from PyTorch Hub

image_path = '/content/yolov5/data/images/bus.jpg'
image = Image.open(image_path)

# Define the target image size for YOLOv5-small (640x640 pixels)
target_size = (640, 640)

# Preprocess the image
preprocess = transforms.Compose([
    transforms.Resize(target_size),
    transforms.ToTensor(),
])

input_data = preprocess(image)
input_data = input_data.unsqueeze(0)  # Add batch dimension

result = model(input_data)

output_feature_maps = model(input_data, feature_vis=True)

x = model.forward(input_data)
print(x.shape)  # ------>torch.Size([1, 1000])

Now I am getting a vector of shape [1, 1000], but I want to extract the features before the classifier layer.

glenn-jocher commented 1 year ago

@prince0310 the feature vector you obtained from the YOLOv5 model represents the output of the classifier layer, which consists of 1000 dimensions. If you want to extract features before the classifier layer, you can modify your code as follows:

# Remove the classifier layer from the model
model_without_classifier = torch.nn.Sequential(*(list(model.children())[:-1]))

# Pass the input data through the modified model
features = model_without_classifier(input_data)

print(features.shape)  # Shape of the extracted features

By removing the last layer of the model, you will obtain the feature tensor before the classifier layer, which will have a different shape depending on the specific architecture of the YOLOv5 model you are using.
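Note that model.children() only yields a module's immediate submodules, so whether [:-1] drops just the classifier depends on how the model is nested. The sketch below shows both cases on toy modules; the `Wrapper` class and the layer sizes are assumptions for illustration, not the YOLOv5 classes.

```python
import torch
import torch.nn as nn

# Flat model: the last immediate child is the classifier.
flat = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 1000),
)
flat_features = nn.Sequential(*list(flat.children())[:-1]).eval()


class Wrapper(nn.Module):
    """Hypothetical wrapper that stores every layer under one attribute."""

    def __init__(self, inner):
        super().__init__()
        self.model = inner

    def forward(self, x):
        return self.model(x)


wrapped = Wrapper(flat)
# The wrapper has a single child, so slicing off the last one leaves nothing.
stripped_wrapper = nn.Sequential(*list(wrapped.children())[:-1])
# Reaching into the inner Sequential keeps everything but the classifier.
wrapped_features = nn.Sequential(*list(wrapped.model.children())[:-1]).eval()

x = torch.rand(1, 3, 32, 32)
with torch.no_grad():
    print(flat_features(x).shape)
    print(len(stripped_wrapper))
    print(wrapped_features(x).shape)
```

Printing the stripped model before using it is a cheap way to confirm which layers survived the slicing.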

Let us know if you have any further questions or need additional assistance!

prince0310 commented 1 year ago

Thanks @glenn-jocher. I have implemented the same approach for feature extraction by removing the classifier layers. Here I am attaching the link to the repo where I have described the implementation for community use.

Once again, I would like to thank you for your reply. You are amazing.

glenn-jocher commented 1 year ago

@prince0310 thank you for your kind words and glad to hear that you were able to implement feature extraction by removing the classifier layers. Keeping the community informed by sharing your implementation through a repository is a great initiative!

If you have any further questions or need any assistance, feel free to ask. We are always here to help.

prince0310 commented 1 year ago

Hi @glenn-jocher, why does model_without_classifier.eval() return empty?

glenn-jocher commented 10 months ago

@prince0310 Calling model_without_classifier.eval() does not run inference, so it produces no predictions or features. It sets the model to evaluation mode and returns the module itself. Evaluation mode is commonly used during inference so that layers behave deterministically (e.g. batch normalization layers use their running statistics instead of updating them, and dropout is disabled).

Therefore, model_without_classifier.eval() only prepares the model for evaluation. If what it prints looks empty, inspect model_without_classifier itself, since that printout is just the module's list of layers. After this call, you can use the model for inference and feature extraction.
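A quick check of how eval() behaves, using a toy module rather than the YOLOv5 model: the call hands back the module it was called on, which is why a notebook cell ending in .eval() prints the module's layers. An empty printout would then suggest that the module itself contains no layers. The `layer_stack` name is a placeholder.

```python
import torch.nn as nn

layer_stack = nn.Sequential(nn.Linear(4, 2), nn.Dropout(0.5))
returned = layer_stack.eval()

print(returned is layer_stack)        # whether eval() returned the same module object
print(layer_stack.training)           # training flag after the call
print(repr(nn.Sequential().eval()))   # how an empty container is displayed
```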

Let me know if you have any other questions or if there's anything else I can assist you with!