visual-layer / fastdup

fastdup is a powerful free tool designed to rapidly extract valuable insights from your image & video datasets. It helps you improve the quality of your dataset images & labels and reduce your data operations costs at scale.

Add enrichment models - Recognize Anything, Tag2text, Grounding DINO and Segment Anything Model #285

Closed dnth closed 8 months ago

dnth commented 9 months ago

This PR adds zero-shot models (Recognize Anything, Tag2text, Grounding DINO and Segment Anything) as enrichment options for fastdup.

Here's a Colab notebook to run the following demo - https://colab.research.google.com/drive/1vTac7oTFIZ2eL1v-CDlSJTuCvbi3j0CB?usp=sharing

The enrichment API usage is as follows:

import fastdup
fd = fastdup.create(input_dir='./coco_minitrain_25k')
fd.run()

# Enrich data with labels from Recognize Anything Model (RAM)
df = fd.enrich(task='zero-shot-classification', model='recognize-anything-model', num_rows=5)

Users can keep adding enrichments from other models by calling the .enrich method again and passing the previous output dataframe as input_df.

# Enrich data with Grounding DINO model.
df = fd.enrich(task='zero-shot-detection', model='grounding-dino', input_df=df)

The dataframe will contain the enriched data

image

The following is a plot using the labels generated by the models

image

Users can continue to enrich the data with the Segment Anything Model, which generates masks from the Grounding DINO bounding boxes.

df = fd.enrich(task='zero-shot-segmentation', model='segment-anything', input_df=df)

The results are as follows

image

dnth commented 9 months ago

@dbickson I've implemented your suggestions. Would you please review them?

dbickson commented 9 months ago

@dnth very nice! Two minor comments. 1) Why read fastdup filenames from a folder when you can run fd.annotations() and get a dataframe with all the filenames? 2) The output of the bounding boxes should be in a fastdup-consumable format, namely columns of filename, row_y, col_x, width, height, with one bounding box per row. This will allow running fastdup again on the annotated bounding boxes for further analysis without any conversions.
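To make the second point concrete, here is a minimal sketch of flattening per-image box lists into one row per bounding box with the columns named above. The helper name boxes_to_fastdup_rows is hypothetical, and the input layout (each box as x_min, y_min, x_max, y_max in pixels) is an assumption, since the thread does not state how the Grounding DINO boxes are stored.

```python
def boxes_to_fastdup_rows(records):
    """Flatten per-image box lists into one row per bounding box.

    Assumption: each box is (x_min, y_min, x_max, y_max) in pixels.
    The PR thread does not confirm this layout.
    """
    rows = []
    for rec in records:
        for x_min, y_min, x_max, y_max in rec["bboxes"]:
            rows.append({
                "filename": rec["filename"],
                "row_y": int(y_min),            # top edge of the box
                "col_x": int(x_min),            # left edge of the box
                "width": int(x_max - x_min),
                "height": int(y_max - y_min),
            })
    return rows


# Hypothetical input: one image with two boxes, one image with none.
records = [
    {"filename": "a.jpg", "bboxes": [(10, 20, 110, 220), (0, 0, 50, 40)]},
    {"filename": "b.jpg", "bboxes": []},
]
rows = boxes_to_fastdup_rows(records)
```

The resulting list of dicts can be passed to pd.DataFrame(rows) to get a dataframe with one box per row. Images without boxes simply contribute no rows.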

dnth commented 9 months ago

@dnth very nice! Two minor comments.

  1. Why read fastdup filenames from a folder when you can run fd.annotations() and get a dataframe with all the filenames?
  2. The output of the bounding boxes should be in a fastdup-consumable format, namely columns of filename, row_y, col_x, width, height, with one bounding box per row. This will allow running fastdup again on the annotated bounding boxes for further analysis without any conversions.

Hi @dbickson as a user, I would like to run the enrichment first and then run fastdup on top of the enriched labels. That is the reason why I read the filenames from a folder.

I imagine the flow would look like

import fastdup
from fastdup.utils import get_images_from_path
import pandas as pd

# Set path and filenames
fd = fastdup.create(input_dir='./coco_minitrain_25k')
filenames = get_images_from_path(fd.input_dir)
df = pd.DataFrame(filenames, columns=["filenames"])

# Enrich
df = fd.enrich(task='zero-shot-classification', model='recognize-anything-model', num_rows=5, input_df=df)

This currently throws a runtime error because fastdup expects users to call fd.run() first.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[4], line 1
----> 1 df = fd.enrich(task='zero-shot-classification', model='recognize-anything-model', num_rows=5)

File ~/anaconda3/envs/mmdet-integration/lib/python3.10/site-packages/fastdup/fastdup_controller.py:1333, in FastdupController.enrich(self, task, model, input_df, user_tags, num_rows, input_col, output_col, device)
   1330 def enrich(self, task, model, input_df=None, user_tags="None", num_rows=None, input_col=None, output_col=None, device=None):
   1332     if input_df is None:
-> 1333         df = self.annotations(valid_only=True)
   1334         df.drop(["index", "error_code", "is_valid", "fd_index"], axis=1, inplace=True)
   1335     else:

File ~/anaconda3/envs/mmdet-integration/lib/python3.10/site-packages/fastdup/fastdup_controller.py:197, in FastdupController.annotations(self, valid_only)
    191 """
    192 Get annotation as data frame
    193 :param valid_only: if True, return only valid annotations
    194 :return: pandas dataframe
    195 """
    196 if not self._fastdup_applied:
--> 197     raise RuntimeError('Fastdup was not applied yet, call run() first')
    198 df_annot = self._df_annot.query(f'{FD.ANNOT_VALID}') if valid_only and self._df_annot is not None \
    199     else self._df_annot
    200 return df_annot

RuntimeError: Fastdup was not applied yet, call run() first

This was the reason why I ran fastdup in the Colab notebook.
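The control flow visible in the traceback can be reproduced with a small stand-in class. This is a hypothetical sketch, not fastdup code: ControllerSketch and its methods only mirror the lines shown above, where enrich falls back to annotations() when no input is given, and annotations() raises until run() has been called. What the real else branch does is not visible in the traceback.

```python
class ControllerSketch:
    """Hypothetical stand-in mirroring the guard shown in the traceback."""

    def __init__(self):
        self._applied = False
        self._annotations = None

    def run(self, filenames):
        # Pretend to analyze the files and remember them as annotations.
        self._annotations = [{"filename": f} for f in filenames]
        self._applied = True

    def annotations(self):
        if not self._applied:
            raise RuntimeError("Fastdup was not applied yet, call run() first")
        return self._annotations

    def enrich(self, input_rows=None):
        # Only the fallback path touches annotations(), so only it can raise.
        rows = self.annotations() if input_rows is None else input_rows
        return [dict(row, enriched=True) for row in rows]


ctrl = ControllerSketch()

# Without input rows and without run(), the guard fires.
try:
    ctrl.enrich()
    error = None
except RuntimeError as exc:
    error = str(exc)

# With explicit input rows, the guard is never reached in this sketch.
result = ctrl.enrich(input_rows=[{"filename": "a.jpg"}])
```

In this sketch the error depends only on whether the fallback path is taken, which is why supplying the filenames explicitly is one way to enrich before running.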

In the end there are two options for users:

I would prefer option B. What do you think?

dnth commented 9 months ago

@dbickson I've added a utility function that converts the bounding boxes into COCO JSON format, which fastdup can consume.

Usage

from fastdup.models.utils import convert_to_coco_format
convert_to_coco_format(df, bbox_col='grounding_dino_bboxes', label_col='grounding_dino_labels', json_filename='grounding_dino_annot_coco_format.json')
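For readers unfamiliar with the target format, the following is a rough sketch of the kind of structure a COCO detection file contains. It is not the implementation of convert_to_coco_format. The function to_coco is hypothetical, and the input box layout (x_min, y_min, x_max, y_max) is an assumption. COCO itself stores each box as [x, y, width, height].

```python
import json


def to_coco(records):
    """Build a COCO-style dict from per-image boxes and labels.

    Assumption: input boxes are (x_min, y_min, x_max, y_max) in pixels.
    """
    categories = {}  # label name -> category id, in order of first appearance
    images, annotations = [], []
    for image_id, rec in enumerate(records, start=1):
        images.append({"id": image_id, "file_name": rec["filename"]})
        for (x0, y0, x1, y1), label in zip(rec["bboxes"], rec["labels"]):
            cat_id = categories.setdefault(label, len(categories) + 1)
            w, h = x1 - x0, y1 - y0
            annotations.append({
                "id": len(annotations) + 1,
                "image_id": image_id,
                "category_id": cat_id,
                "bbox": [x0, y0, w, h],  # COCO convention: x, y, width, height
                "area": w * h,
                "iscrowd": 0,
            })
    return {
        "images": images,
        "annotations": annotations,
        "categories": [{"id": i, "name": n} for n, i in categories.items()],
    }


# Hypothetical example data.
coco = to_coco([
    {"filename": "a.jpg", "bboxes": [(10, 20, 110, 220)], "labels": ["dog"]},
    {"filename": "b.jpg", "bboxes": [(0, 0, 50, 40), (5, 5, 15, 25)],
     "labels": ["cat", "dog"]},
])
coco_json = json.dumps(coco, indent=2)
```

Full COCO files usually also record each image's width and height. They are omitted here to keep the sketch short.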

After that, users can conveniently use them in fastdup

import fastdup
fd = fastdup.create(input_dir="./")
fd.run(annotations="grounding_dino_annot_coco_format.json", overwrite=True)

Users can get the fastdup annotations format by running fd.annotations()

image

Output when running fd.vis.similarity_gallery()

image

The above Colab notebook is now updated with all changes.

dnth commented 9 months ago

Additionally, I've also included the plotting function in fastdup.models.utils

from fastdup.models.utils import plot_annotations
plot_annotations(df, image_col='filename', tags_col='ram_tags', bbox_col='grounding_dino_bboxes', scores_col='grounding_dino_scores', labels_col='grounding_dino_labels', num_rows=10)

image

plot_annotations(df, image_col='filename', tags_col='ram_tags', bbox_col='grounding_dino_bboxes', scores_col='grounding_dino_scores', labels_col='grounding_dino_labels', masks_col='sam_masks', num_rows=5)

image

dnth commented 8 months ago

Merged manually into version 1.46