Closed tadejsv closed 3 years ago
Hi @tadejsv,
Can you please assign this to me?
@vivek2301 sure, go ahead. Please check out this file (our new contributor guide, which will be merged to master soon):
https://github.com/jina-ai/executors/blob/b6c33e689326d1ac7888ce32aaa27ef66047aa78/CONTRIBUTING.md
Thanks. I'll follow the contributor guide.
Hi @tadejsv, @vivek2301, can I also work on this issue? Also, is the encoder already made, or do we need to build a new one?
Hi @VIDIT-OSTWAL , for now @vivek2301 will be the one working on it, as I think this is a task suitable for a single person. If anything comes up, we'll let you know
Cool, np.
@vivek2301 are you working on this currently? If not, please let us know, so that I can assign the task to someone else
@tadejsv yes, I'm working on this. It took some time to understand Jina's framework and to go through the cookbook and contribution guidelines. I've already completed part of the code. I'm currently looking at the various models in timm, as I need to select the respective layers from them for the encoding. timm has the following modules: ['byoanet', 'byobnet', 'cait', 'coat', 'convit', 'cspnet', 'densenet', 'dla', 'dpn', 'efficientnet', 'ghostnet', 'gluon_resnet', 'gluon_xception', 'hardcorenas', 'hrnet', 'inception_resnet_v2', 'inception_v3', 'inception_v4', 'levit', 'mlp_mixer', 'mobilenetv3', 'nasnet', 'nfnet', 'pit', 'pnasnet', 'regnet', 'res2net', 'resnest', 'resnet', 'resnetv2', 'rexnet', 'selecsls', 'senet', 'sknet', 'swin_transformer', 'tnt', 'tresnet', 'twins', 'vgg', 'visformer', 'vision_transformer', 'vision_transformer_hybrid', 'vovnet', 'xception', 'xception_aligned']
Do I need to build the encoder for all the models or some subset of them?
@vivek2301 , great - then you can open a draft PR, and keep working on it. It doesn't have to be finished, but it will help us track the progress.
As for the models - I think the user can simply pass the `model_name` string, and timm will build the correct model automatically. This is nothing we need to implement ourselves; the executor should be model-agnostic.
Sure, I will open the PR. Yes, the model does not need to be built, but the specific layer to get the encoding from still needs to be selected, at least for each module. I think it's the same for the PyTorch encoder: ImageTorchEncoder.
I will try and complete this as soon as possible.
@vivek2301 I think this is not necessary - as I show in the example above, timm has a unified interface for all models, and with the proper settings in `create_model` you get the pooled last-layer features with a normal call to the model object.
@tadejsv Awesome, thanks for this. timm does not list `num_classes` as a parameter of `create_model` and I missed this. You saved me a lot of time, thanks.
Please check out its documentation; it has a section on feature extraction, which explains all of this.
timm is the largest library of image models. Not only that, it also has a unified and very simple to use interface that stays the same across models. For example, the same code will work when replacing `my_model` with `resnet34` or `vit_large_patch16_384` - two completely different model architectures. This is extremely low-hanging fruit; I am shocked it has not been implemented yet (especially since some other models - like BigTransfer, which we have implemented - are just a small subset of what's available in this library).