Closed james77777778 closed 3 months ago
@james77777778 I was thinking we would make FeaturePyramidBackbone a simple subclass of Backbone for most CV backbones.
So...
Backbone is basically just a functional model with from_preset.
FeaturePyramidBackbone extends Backbone with extra pyramid_outputs.
ResNetBackbone extends FeaturePyramidBackbone directly.
The goal is to keep the Backbone base class as clean as we can, now that we are venturing into multimodal models with a lot of different overall patterns. dir(bert_backbone) shouldn't have feature pyramid stuff in it. If we could move the token_embedding off the backbone and into a TextBackbone or similar without breaking compat, we probably would too.
Most CV models like ResNet, DenseNet, EfficientNet, etc., will be FeaturePyramidBackbones. But something like ViT can just subclass Backbone directly without needing the feature pyramid outputs.
WDYT?
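For reference, the hierarchy proposed in this comment could look roughly like the plain-Python sketch below. It is only an illustration under stated assumptions, not the code in this PR. The classes are stand-ins with no Keras dependency, the from_preset body is a hypothetical stub, and the level names ("P2" to "P5") and placeholder string values are assumptions made for the example.

```python
class Backbone:
    """Stand-in for the shared base class: nothing vision-specific lives here."""

    def __init__(self, name):
        self.name = name

    @classmethod
    def from_preset(cls, preset):
        # Hypothetical stub: the real method would load a config and weights.
        return cls(name=preset)


class FeaturePyramidBackbone(Backbone):
    """Adds only the pyramid_outputs property on top of Backbone."""

    @property
    def pyramid_outputs(self):
        return self._pyramid_outputs

    @pyramid_outputs.setter
    def pyramid_outputs(self, value):
        if not isinstance(value, dict):
            raise TypeError("pyramid_outputs must be a dict of level name -> output")
        self._pyramid_outputs = value


class ResNetBackbone(FeaturePyramidBackbone):
    """A CV backbone that exposes its intermediate feature maps."""

    def __init__(self, name, num_levels=4):
        super().__init__(name)
        # Placeholder strings stand in for tensors; the "P2".."P5" level
        # names are an assumption made for this sketch.
        self.pyramid_outputs = {
            f"P{i}": f"{name}/level_{i}" for i in range(2, 2 + num_levels)
        }


class ViTBackbone(Backbone):
    """Subclasses Backbone directly: no feature pyramid needed."""


class BertBackbone(Backbone):
    """A text backbone: its dir() stays free of feature pyramid attributes."""


resnet = ResNetBackbone.from_preset("resnet50")
bert = BertBackbone.from_preset("bert_base")
print(list(resnet.pyramid_outputs))
print("pyramid_outputs" in dir(bert))
```

With this layout, only backbones that opt in by subclassing FeaturePyramidBackbone carry pyramid_outputs, so the attribute never shows up on text or ViT-style backbones.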
@mattdangerw @divyashreepathihalli Got it. That makes sense. Updated!
I have updated the PR to make it compatible with timm. Additionally, the conversion logic has been added, similar to how it's done in transformers.
Please refer to this colab for the numerical check: https://colab.research.google.com/drive/1QnmNDiFYd56fsYoaUM46QRT4gF9G06fH?usp=sharing
Supported:
Looks really good! It's awesome to have built-in timm conversion! LGTM!
Thanks!
This PR introduces FeaturePyramidBackbone, a wrapper for Backbone that adds a pyramid_outputs property. If a vision backbone supports feature pyramids, it should subclass FeaturePyramidBackbone.
I modified ResNetBackbone by subclassing FeaturePyramidBackbone to include feature pyramid information.
Also, head_dtype is added in ResNetImageClassifier.
@divyashreepathihalli @mattdangerw @SamanehSaadat
EDITED: See https://github.com/keras-team/keras-nlp/pull/1769#issuecomment-2287967827 for updates