keras-team / keras-cv

Industry-strength Computer Vision workflows with Keras

Implemented CoCa architecture #2371

Open VarunS1997 opened 9 months ago

VarunS1997 commented 9 months ago

What does this PR do?

Implements the architecture described in "CoCa: Contrastive Captioners are Image-Text Foundation Models" (https://arxiv.org/pdf/2205.01917.pdf).

This PR requires:

Before submitting

divyashreepathihalli commented 8 months ago

One additional piece of overhead work is needed.

Please add keras_cv/models/feature_extractor/coca to lines 72 and 86 of this file: https://github.com/keras-team/keras-cv/blob/master/.kokoro/github/ubuntu/gpu/build.sh

PS: We will fix this overhead soon, but in the meantime this is what we need to do to make sure the large GPU tests run.

divyashreepathihalli commented 3 months ago

@VarunS1997 will you be completing this one?