**Description**
The clustering algorithm leaves every part of a masked image that isn't in the clustered class black. These black regions make it very hard for Google to classify the images. Removing or cropping out the black regions would help a lot.
**To Reproduce**
Steps to reproduce the behavior:
1. Put this image in your script directory:
2. Run this:
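```python
from MAGIST.Vision.UnsupervisedModels.img_cluster import RoughCluster
from MAGIST.Utils.WebScraper.google import GoogleScraper

# Cluster the input image into masked images
cluster = RoughCluster("config.json")
imgs = cluster.unsupervised_clusters(6, "Input.jpg", (150, 150), "Clusters")

# Reverse image search each masked image and collect the labels
scraper = GoogleScraper("config.json")
labels = []
for i in imgs:
    label = scraper.reverse_image_search(i)
    labels.append(label)

print(labels)
```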
**Expected behavior**
Ideally, the clustering should also mask out nearby, similar pixels, but it doesn't. This leaves large, unstructured black gaps that cause a lot of issues when reverse searching.
**Screenshots**
![image](https://user-images.githubusercontent.com/85193239/175092667-d53d902e-c2f9-486c-a252-efb83898ad83.png)
![image](https://user-images.githubusercontent.com/85193239/175092774-83e79f7d-26a9-4d15-b571-404ff14e6fb6.png)
**Additional context**
Google generally returns `language` or `night` when there is too much black. The solution would be something like this:
1. Find each masked image's edge pixels.
2. Compute all possible lines that can be formed from pairs of edge pixels.
3. Crop the image at a line if every pixel on that line is black.

However, this is very computationally intensive: the number of candidate lines grows quadratically with the number of edge pixels.
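
A much cheaper approximation of the steps above is to consider only horizontal and vertical lines, which amounts to cropping each masked image to the bounding box of its non-black pixels. The sketch below is not part of MAGIST. It assumes the masked image is already loaded as a NumPy array, and `crop_black_borders` and its `threshold` parameter are hypothetical names used for illustration.

```python
import numpy as np


def crop_black_borders(img: np.ndarray, threshold: int = 0) -> np.ndarray:
    """Crop away border rows and columns that contain only black pixels.

    img: H x W (grayscale) or H x W x C array.
    threshold: values <= threshold count as black (hypothetical knob).
    """
    # A pixel is non-black if any of its channels exceeds the threshold.
    if img.ndim == 3:
        non_black = (img > threshold).any(axis=2)
    else:
        non_black = img > threshold

    # Indices of rows and columns that contain at least one non-black pixel.
    rows = np.flatnonzero(non_black.any(axis=1))
    cols = np.flatnonzero(non_black.any(axis=0))
    if rows.size == 0:
        return img  # fully black image: nothing sensible to crop to

    # Keep the smallest axis-aligned box that holds every non-black pixel.
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


# Usage: a 6 x 8 RGB image that is black except for a 2 x 3 patch.
demo = np.zeros((6, 8, 3), dtype=np.uint8)
demo[2:4, 3:6] = 200
cropped = crop_black_borders(demo)
print(cropped.shape)
```

This only trims black borders. Black gaps inside the bounding box and diagonal black corners remain, so it may reduce the amount of black rather than eliminate it.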