visual-layer / fastdup

fastdup is a powerful free tool designed to rapidly extract valuable insights from your image and video datasets. It helps you improve the quality of your dataset images and labels and reduce your data operations costs at an unparalleled scale.

Add TIMM integration to extract embeddings #284

Closed. dnth closed this 8 months ago.

dnth commented 9 months ago

This PR:

Parameters for TimmEncoder

Usage

import fastdup
from fastdup.embeddings.timm import TimmEncoder

# Compute embeddings
timm_model = TimmEncoder('resnet18')
timm_model.compute_embeddings("../images")

# Run fastdup
fd = fastdup.create(input_dir=timm_model.img_folder)
fd.run(annotations=timm_model.file_paths, embeddings=timm_model.embeddings)
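
The `fd.run` call above receives file paths and embeddings as two separate arguments, so row i of the embeddings presumably has to describe file i. Below is a minimal sanity check one could run before calling `fd.run`. It assumes `file_paths` is a sequence of path strings and `embeddings` is a 2-D array-like with one row per image; the PR does not confirm these shapes. `check_alignment` is a hypothetical helper and not part of fastdup or this PR.

```python
from typing import Sequence


def check_alignment(file_paths: Sequence[str],
                    embeddings: Sequence[Sequence[float]]) -> int:
    """Hypothetical helper (not part of fastdup): verify one embedding row per file.

    Returns the embedding dimension if everything lines up,
    raises ValueError otherwise.
    """
    if len(file_paths) == 0:
        raise ValueError("no file paths given")
    if len(file_paths) != len(embeddings):
        raise ValueError(
            f"{len(file_paths)} file paths but {len(embeddings)} embedding rows"
        )
    if len(set(file_paths)) != len(file_paths):
        raise ValueError("duplicate file paths")
    # Every row must have the same length (the embedding dimension).
    dims = {len(row) for row in embeddings}
    if len(dims) != 1:
        raise ValueError(f"embedding rows have inconsistent lengths: {sorted(dims)}")
    return dims.pop()


# Toy stand-ins for timm_model.file_paths and timm_model.embeddings
# (assumed shapes; real embeddings would be much wider).
paths = ["a.jpg", "b.jpg", "c.jpg"]
vectors = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]]
dim = check_alignment(paths, vectors)
```

In the usage above, this would be called as `check_alignment(timm_model.file_paths, timm_model.embeddings)`, provided those attributes behave like the assumed sequences.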

Here's a bare-minimum Colab notebook to try the integration: https://colab.research.google.com/drive/1hDI8SNQU1lhp6d3Q03BhxDk8ILT7dxGl?usp=sharing

I will make another PR for a proper notebook once this integration is merged.

dnth commented 9 months ago

@dbickson @amiralush This PR is ready for review

dnth commented 8 months ago

Merged manually into version 1.46