Info
Name
Suhas Gopal
Python Version
Python 3.10
Description
AI
For the AI section, I first tried GroundingDINO-tiny and YOLO-World for zero-shot detection, but the results were unreliable. The prompt 'single_metal_part' worked better for some images and 'small_part' for others, while specific class names such as 'screws' and 'bolts' did not do well. Even with the better prompts, neither model could detect objects correctly in images containing many objects, even at very low confidence thresholds (1e-3/1e-4). So I decided to fine-tune a model for these objects. Since the given dataset is small and manual annotation is very time-consuming, I used an external dataset, MVTec Screws. It contains all the kinds of objects with minor differences, though its images have very different backgrounds and a lower resolution.
Given the limited time and compute, I chose YOLOv8-nano. I converted the dataset to YOLO format and trained the model for 100 epochs. It recognises most objects, but for the most part it still falls short of 95% accuracy (judged by visual inspection of the results) and it produces duplicate detections.
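The conversion to YOLO format can be illustrated with a minimal sketch. It assumes each annotation has already been reduced to an axis-aligned pixel box `(x_min, y_min, x_max, y_max)`; how the original MVTec Screws annotations were mapped to such boxes is not described here, and the helper name `to_yolo_line` is hypothetical. A YOLO label line holds the class index followed by the box centre and size, all normalised by the image width and height.

```python
def to_yolo_line(class_id, box_xyxy, img_w, img_h):
    """Format one axis-aligned pixel box as a YOLO label line.

    Hypothetical helper: assumes box_xyxy = (x_min, y_min, x_max, y_max)
    in pixels, with the origin at the top-left corner of the image.
    """
    x_min, y_min, x_max, y_max = box_xyxy

    # Clip to the image so normalised values stay inside [0, 1].
    x_min, x_max = max(0.0, x_min), min(float(img_w), x_max)
    y_min, y_max = max(0.0, y_min), min(float(img_h), y_max)

    # YOLO stores the box centre and size, normalised by image dimensions.
    cx = (x_min + x_max) / 2.0 / img_w
    cy = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h

    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # One label line per object; one .txt file per image.
    print(to_yolo_line(0, (100, 50, 300, 150), img_w=400, img_h=200))
```

Each image then gets a `.txt` file of the same base name containing one such line per object, which is the layout YOLO-style trainers expect.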
Non_AI
For the Non-AI section, I used only morphological operations, Hough transforms and contour detection to detect the individual objects. First, I thresholded the image and applied opening and closing to remove noise. I then used two different techniques, one for screws/bolts and another for nuts. For screws, I perform dilation followed by thresholding of the distance transform to get a peak centre patch for each object (this can help separate some objects that are in contact), then find the contours of these patches and their bounding boxes. For nuts, I find the inner contours and then use Hough circle detection to detect the holes at the centre of the nuts (the screw approach failed badly here because of the relatively tighter packing in images with nuts).
Additional Comments
Please check Suhas_Gopal/Solution.md for a more detailed description and results.