Is your feature request related to a problem? Please describe.
At the moment, only CLIP models are supported. These are strong models and work across both image and text. However, timm contains many other models that are SOTA in classification and can still provide good embeddings. They also span a large range of sizes and architectures, so they offer good accuracy/latency trade-offs.
Describe the solution you'd like
A new class of timm models can be specified for the "model" field at index creation time.
Describe alternatives you've considered
None
Additional context
https://github.com/rwightman/pytorch-image-models