What

Add a CustomModel construct that allows using a custom feature extractor. (See the example notebook use_custom_model.ipynb for detailed usage.)
Add EfficientNet and Vision Transformer (ViT) models, in addition to the default MobileNetV3, for feature extraction.

Why

MobileNetV3 is no longer state of the art for image classification, as evidenced by its performance on the ImageNet dataset. EfficientNet and Vision Transformers represent the current SOTA instead. (MobileNetV3 remains the default CNN model.)
The CNN ecosystem has many promising models, several of them hosted on Torch Hub, the Hugging Face model hub, etc. It would be great to let the community use a CNN model of their choice for their use case.

How

A CustomModel construct has been added, which can be passed to a new CNN constructor argument, model_config. CustomModel accepts any torch module that can generate features. (A corresponding transform must also be provided that takes in a PIL image and outputs a PyTorch tensor. The model must produce a batch_size x features PyTorch tensor.) It can be imported from imagededup.utils.
EfficientNet and ViT models have been added to imagededup.utils.models.
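
To illustrate the contract described above, here is a minimal sketch of a torch module and a matching transform. TinyFeatureExtractor and pil_to_tensor are hypothetical names made up for this example and are not part of imagededup. The commented-out wiring at the end assumes CustomModel takes name, model and transform fields; check use_custom_model.ipynb for the actual usage.

```python
import numpy as np
import torch
from PIL import Image
from torch import nn


class TinyFeatureExtractor(nn.Module):
    """Hypothetical extractor: maps (batch, 3, H, W) images to (batch, features)."""

    def __init__(self, num_features: int = 16):
        super().__init__()
        self.conv = nn.Conv2d(3, num_features, kernel_size=3, padding=1)
        # Global average pooling collapses the spatial dimensions to 1 x 1.
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.relu(self.conv(x))
        # Flatten (batch, features, 1, 1) into (batch, features).
        return self.pool(x).flatten(start_dim=1)


def pil_to_tensor(image: Image.Image, size: int = 32) -> torch.Tensor:
    """Hypothetical transform: PIL image in, float tensor of shape (3, size, size) out."""
    image = image.convert("RGB").resize((size, size))
    array = np.asarray(image, dtype=np.float32) / 255.0  # (H, W, 3), values in [0, 1]
    return torch.from_numpy(array).permute(2, 0, 1).contiguous()


# Four synthetic images stand in for real files.
images = [Image.new("RGB", (64, 48), color=(i * 40, 0, 0)) for i in range(4)]
batch = torch.stack([pil_to_tensor(img) for img in images])

model = TinyFeatureExtractor().eval()
with torch.no_grad():
    features = model(batch)

print(features.shape)  # batch_size x features

# Assumed wiring into imagededup (field names are an assumption, not verified here):
# from imagededup.methods import CNN
# from imagededup.utils import CustomModel
# config = CustomModel(name="tiny", model=model, transform=pil_to_tensor)
# cnn = CNN(model_config=config)
```

The only properties this sketch relies on are the two stated in the description: the transform returns a PyTorch tensor for a PIL image, and the model returns one feature vector per image in the batch.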

Choice of models

EfficientNet and ViT have been added with the following in mind:

Availability in the torchvision.models subpackage.
Size of the model: we wanted to avoid packaging models that are too large (more parameters mean larger memory requirements and longer feature generation times).
SOTA performance on the ImageNet dataset.
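
As a rough way to check the size criterion for a candidate model, the parameter count of any torch module can be computed directly. This is a sketch only: count_parameters and float32_weight_megabytes are hypothetical helpers, and the estimate covers weights stored as float32, not activations or other runtime memory.

```python
from torch import nn


def count_parameters(model: nn.Module) -> int:
    """Total number of parameters (trainable or not) in the module."""
    return sum(p.numel() for p in model.parameters())


def float32_weight_megabytes(model: nn.Module) -> float:
    """Approximate weight storage, assuming 4 bytes per parameter."""
    return count_parameters(model) * 4 / 1e6


# A small layer as a stand-in: 10 * 5 weights plus 5 biases.
small = nn.Linear(10, 5)
print(count_parameters(small), float32_weight_megabytes(small))
```

The same two helpers can be applied to any model under consideration, whether it comes from torchvision.models or elsewhere, to compare sizes before choosing one.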

CustomModel can also use models that are not hosted on torchvision. For an example that uses a model from the Hugging Face model hub, see the example notebook use_custom_model.ipynb.

idealo / imagededup

Add ability to use custom CNN models #190
