Add `CLIP` model - Githubissues

keras-team / keras-hub

Pretrained model hub for Keras 3

Apache License 2.0

804 stars 243 forks source link

Related to #1752

Preset: openai/clip-vit-base-patch32 (should work for all CLIP models)

Inputs:	Image	Text
	"a photo of a cat", "a photo of a dog"

Outputs (probability):

There are some questions about the upcoming task definition:

What type of Task should we introduce? The original FeatureExtractor seems somewhat ambiguous to me.
Should we modify CLIPPreprocessor to accept image-text pairs for the new CLIPBackbone or new task? If so, we would need to update the SD3 implementation as well.

cc @divyashreepathihalli

keras-team / keras-hub