Add TIMM integration to extract embeddings

dnth commented 9 months ago

This PR:

Adds the TIMM integration to fastdup so users can extract embeddings with any model available in TIMM.
Adds tests for the TIMM integration.
Adds sample images for the tests to run on.

Parameters for TimmEncoder

model_name (str): The name of the model architecture to use.
num_classes (int): The number of classes for the model. Use num_classes=0 to exclude the last layer. Default: 0.
pretrained (bool): Whether to load pretrained weights. Default: True.
device (str): Which device to load the model on. Choices: "cuda" or "cpu". Default: None.
torch_compile (bool): Whether to use torch.compile to optimize the model. Default: False.
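
As a rough illustration of how the parameters above fit together, here is a minimal sketch that collects them in a small config object and checks the documented choices. The `TimmEncoderOptions` class and its `to_kwargs` method are hypothetical helpers written for this example and are not part of fastdup. Only the parameter names, types, and defaults come from the list above.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class TimmEncoderOptions:
    """Hypothetical container mirroring the TimmEncoder parameters listed above."""

    model_name: str
    num_classes: int = 0          # 0 excludes the last layer
    pretrained: bool = True
    device: Optional[str] = None  # "cuda", "cpu", or None (the documented default)
    torch_compile: bool = False

    def __post_init__(self):
        # Reject values outside the documented choices before any model is built.
        if not self.model_name:
            raise ValueError("model_name must be a non-empty string")
        if self.num_classes < 0:
            raise ValueError("num_classes must be >= 0")
        if self.device not in (None, "cuda", "cpu"):
            raise ValueError(
                f"device must be 'cuda', 'cpu', or None, got {self.device!r}"
            )

    def to_kwargs(self) -> dict:
        """Return the options as a plain dict of keyword arguments."""
        return asdict(self)


opts = TimmEncoderOptions("resnet18", device="cpu")
print(opts.to_kwargs())
```

If `TimmEncoder` accepts these names as keyword arguments (an assumption, since the usage in this PR only passes the model name positionally), the dict could be forwarded as `TimmEncoder(**opts.to_kwargs())`.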

Usage

import fastdup
from fastdup.embeddings.timm import TimmEncoder

# Compute embeddings
timm_model = TimmEncoder('resnet18')
timm_model.compute_embeddings("../images")

# Run fastdup
fd = fastdup.create(input_dir=timm_model.img_folder)
fd.run(annotations=timm_model.file_paths, embeddings=timm_model.embeddings)
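
The usage above hands fastdup a list of file paths together with precomputed embeddings. A quick sanity check before that call can be sketched as follows, assuming the embeddings are expected to hold one fixed-length row per file path, in the same order. `check_alignment` is a hypothetical helper for illustration and not a fastdup function. It relies only on `len()`, so it works on nested lists and should also work on a 2-D NumPy array.

```python
from typing import Sequence


def check_alignment(file_paths: Sequence[str],
                    embeddings: Sequence[Sequence[float]]) -> int:
    """Verify one embedding row per file path and return the embedding dimension."""
    if len(file_paths) == 0:
        raise ValueError("no file paths given")
    if len(file_paths) != len(embeddings):
        raise ValueError(
            f"{len(file_paths)} file paths but {len(embeddings)} embedding rows"
        )
    dim = len(embeddings[0])
    # Every row must have the same length as the first one.
    for path, row in zip(file_paths, embeddings):
        if len(row) != dim:
            raise ValueError(f"{path}: expected {dim} values, got {len(row)}")
    return dim


# Toy data standing in for timm_model.file_paths and timm_model.embeddings.
paths = ["a.jpg", "b.jpg"]
vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
print(check_alignment(paths, vectors))
```

In the flow above, the same check could be run as `check_alignment(timm_model.file_paths, timm_model.embeddings)` just before `fd.run(...)`.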

Here's a bare-minimum Colab notebook to try the integration: https://colab.research.google.com/drive/1hDI8SNQU1lhp6d3Q03BhxDk8ILT7dxGl?usp=sharing

I will make another PR for a proper notebook once this integration is merged.

dnth commented 9 months ago

@dbickson @amiralush This PR is ready for review

dnth commented 8 months ago

Merged manually into version 1.46

visual-layer / fastdup

Add TIMM integration to extract embeddings #284