alan-turing-institute / ARC-LoCoMoSeT

Low-Cost Model Selection for Transformers
MIT License

Select a pool of models, all pre-trained on the same dataset #5

Closed: jack89roberts closed this 5 months ago

jack89roberts commented 1 year ago

For phase 1:

jack89roberts commented 1 year ago

Some have nice tables comparing the variants of their models, e.g.

Watch out for some models being fine-tuned on ImageNet-1k and some on ImageNet-21k/22k.
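One rough way to spot-check this from a checkpoint's config (a heuristic sketch, not from the thread; the model card stays authoritative):

```python
from transformers import AutoConfig

# Heuristic spot-check (an assumption, not from the thread): a 1000-entry
# label map usually indicates an ImageNet-1k fine-tuned head; other sizes
# (or headless checkpoints) mean the model card needs checking.
config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
print(len(config.id2label))  # 1000 for this ImageNet-1k checkpoint
```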

jack89roberts commented 1 year ago

Base classes relevant for the above (the forward methods can help identify which layer/module we want to take outputs from, for example):

Maybe also the WithTeacher classes (to investigate), e.g.

All use a self.<ARCHITECTURE_NAME> call followed by a self.classifier call, with some processing/slicing in between that either selects the first row of the output (e.g. corresponding to the input <cls> token in ViT) or pools/means across the per-patch outputs in some way. The WithTeacher classes have an additional distillation_classifier.
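As a rough illustration, a toy paraphrase of that pattern (not the actual transformers code):

```python
import torch
from torch import nn

# Toy paraphrase of the head pattern described above, not the real HF code:
# backbone call -> select/pool hidden states -> classifier head.
class ToyImageClassifier(nn.Module):
    def __init__(self, backbone: nn.Module, hidden_size: int, num_labels: int):
        super().__init__()
        self.backbone = backbone                       # the self.<ARCHITECTURE_NAME> call
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
        sequence_output = self.backbone(pixel_values)  # (batch, seq_len, hidden_size)
        features = sequence_output[:, 0, :]            # first row = <cls> token (ViT-style)
        return self.classifier(features)               # some models mean-pool instead
```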

We want the processed sequence_output/pooled_output values as our features for most of the metrics.
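A minimal sketch of pulling those values out via the backbone class (the checkpoint name is just an example; exact output attributes vary by architecture):

```python
import torch
from transformers import AutoImageProcessor, AutoModel

model_name = "google/vit-base-patch16-224"  # example checkpoint
processor = AutoImageProcessor.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)  # backbone only, no classifier head

inputs = processor(images=torch.rand(3, 224, 224), return_tensors="pt")  # dummy image
with torch.no_grad():
    outputs = model(**inputs)

sequence_output = outputs.last_hidden_state  # (batch, num_patches + 1, hidden_size)
features = sequence_output[:, 0, :]          # <cls> row: the pre-classifier features
# outputs.pooler_output is also available when the model defines a pooler
```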

eddableheath commented 12 months ago

Putting Jack's note from Slack here for posterity:

> Looking at the HuggingFace source code, I think setting num_labels=0 when loading the models might give us the features without needing to implement anything else ourselves.
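A minimal sketch of that trick, assuming the current transformers behaviour for ViT-style heads (num_labels=0 swaps the classifier for nn.Identity, so the returned logits are the pre-classifier features):

```python
import torch
from transformers import AutoImageProcessor, AutoModelForImageClassification

model_name = "google/vit-base-patch16-224"  # example checkpoint
processor = AutoImageProcessor.from_pretrained(model_name)
# With num_labels=0 the classification head becomes nn.Identity(), so the
# "logits" the model returns are actually the pre-classifier features.
model = AutoModelForImageClassification.from_pretrained(model_name, num_labels=0)

inputs = processor(images=torch.rand(3, 224, 224), return_tensors="pt")
with torch.no_grad():
    features = model(**inputs).logits  # (1, hidden_size) rather than (1, 1000)
```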

jack89roberts commented 11 months ago

Removing this from WP1 now. We have some initial results with:

Need to pick a bigger pool of models for the later work packages.

eddableheath commented 8 months ago

Collection of every transformers-compatible model pretrained on ImageNet on the Hugging Face Hub, grouped by uploader (one way to reproduce such a query is sketched after the list):

- Google
- Facebook
- Microsoft
- Apple
- Intel
- Visual-Attention-Network
- Matthijs
- optimum
- fxmarty
- shi-labs
- Xrenya
- Snarci
- MBZUAI
- shehan97
- Zetatech
- tensorgirl
- grlh11
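For reference, one way such a list could be queried with huggingface_hub (an assumption about how it was compiled; the trained_dataset filter depends on model-card metadata being filled in):

```python
from huggingface_hub import HfApi

# Sketch (an assumption, not necessarily how the list above was built):
# enumerate transformers-compatible image-classification models whose
# model cards tag them as trained on ImageNet-1k.
api = HfApi()
for model in api.list_models(
    library="transformers",
    task="image-classification",
    trained_dataset="imagenet-1k",
):
    print(model.id)
```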

eddableheath commented 8 months ago

There are ~80 models listed above. Do we have the scope to fine-tune all of them on every dataset we pick?

And do we have stipulations on the source of these models other than compatibility with our framework? I've ordered the sources (i.e. where these models are from and who is responsible for uploading them to the model hub) with well-known companies first.

eddableheath commented 5 months ago

Should this be closed?