keras-team / keras-hub

Pretrained model hub for Keras 3

[T5 1.1] Enable v1.1 Presets #1948

Closed DavidLandup0 closed 3 weeks ago

DavidLandup0 commented 1 month ago

It turns out we already support T5 1.1 operationally (the gated activations are already implemented), but only vanilla T5 models were exposed through weight conversion and presets.
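
For context, the 1.1 variants mainly differ from vanilla T5 in using a gated GELU feedforward instead of plain ReLU and in not tying the embedding weights. A rough sketch of how that maps onto the existing T5Backbone arguments; the argument names reflect my reading of the current constructor, and the sizes are small illustrative values rather than a real preset config:

import keras_hub

# Sketch of a T5 1.1-style backbone configuration. The key differences
# from vanilla T5 are the gated GELU feedforward and the untied embeddings.
# Sizes here are illustrative, not a real preset config.
backbone = keras_hub.models.T5Backbone(
    vocabulary_size=32128,
    num_layers=2,
    num_heads=4,
    hidden_dim=256,
    intermediate_dim=512,
    key_value_dim=64,
    activation="gelu",            # 1.1 uses GELU instead of ReLU
    use_gated_activation=True,    # gated feedforward (GEGLU)
    tie_embedding_weights=False,  # 1.1 unties input/output embeddings
)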

This PR updates the conversion script to include the T5 1.1 variants.

For example:

t5_small = keras_hub.models.T5Backbone.from_preset("t5_1.1_small")
tokenizer = keras_hub.models.T5Tokenizer.from_preset("t5_1.1_small")

It also updates the conversion script to use the save_to_preset() functionality, fixes assertions that raised exceptions, and saves the tokenizer as well.
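
As a rough sketch of the save/upload flow the conversion script now follows (the local directory and Kaggle handle below are placeholders, and the upload call reflects my understanding of the current keras_hub API):

import keras_hub

# Stand-ins for the converted objects in the real script; an existing
# preset is loaded here only to illustrate the save/upload flow.
backbone = keras_hub.models.T5Backbone.from_preset("t5_1.1_small")
tokenizer = keras_hub.models.T5Tokenizer.from_preset("t5_1.1_small")

preset_dir = "./t5_1.1_small"         # local output directory (placeholder)
backbone.save_to_preset(preset_dir)   # writes the config and weights
tokenizer.save_to_preset(preset_dir)  # writes the tokenizer assets

# Placeholder Kaggle handle; requires Kaggle credentials.
keras_hub.upload_preset("kaggle://<user>/t5/keras/t5_1.1_small", preset_dir)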

Numerical Equivalence

small = keras_hub.models.T5Backbone.from_preset("t5_1.1_small")
keras_tokenizer = keras_hub.models.T5Tokenizer.from_preset("t5_1.1_small")

Behaves equivalently to:

hf_tokenizer = transformers.AutoTokenizer.from_pretrained("google/t5-v1_1-small")
hf_model = transformers.T5ForConditionalGeneration.from_pretrained("google/t5-v1_1-small")
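
Roughly, the check feeds the same token ids to both models and compares the encoder outputs. The sketch below continues from the snippets above (small, hf_tokenizer, hf_model); the input/output key names follow the existing T5Backbone signature as I understand it and should be treated as an assumption:

import keras
import numpy as np
import torch

# Feed the exact same token ids to both models so the comparison
# isolates the converted weights rather than the tokenizers.
text = ["The quick brown fox jumped over the lazy dog."]
hf_inputs = hf_tokenizer(text, return_tensors="pt")

with torch.no_grad():
    hf_encoder_out = hf_model.encoder(
        input_ids=hf_inputs["input_ids"],
        attention_mask=hf_inputs["attention_mask"],
    ).last_hidden_state.numpy()

keras_out = small(
    {
        "encoder_token_ids": hf_inputs["input_ids"].numpy(),
        "encoder_padding_mask": hf_inputs["attention_mask"].numpy().astype(bool),
        # A single pad token on the decoder side; only the encoder
        # output is compared here.
        "decoder_token_ids": np.zeros((1, 1), dtype="int32"),
        "decoder_padding_mask": np.ones((1, 1), dtype=bool),
    }
)["encoder_sequence_output"]

keras_encoder_out = keras.ops.convert_to_numpy(keras_out)
print("max abs diff:", np.abs(keras_encoder_out - hf_encoder_out).max())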

PCA on the flattened outputs, run on the same input:

[Image: PCA of the flattened outputs from both models]

Notes

The XXL version (11B params, 44 GB of weights) is too large for consumer hardware, so I can't run the conversion script on it. I'll get the XL weights up on Kaggle as soon as the download finishes.

/cc @divyashreepathihalli

divyashreepathihalli commented 3 weeks ago

@DavidLandup0 the GPU tests are failing, can you please take a look?

DavidLandup0 commented 3 weeks ago

> @DavidLandup0 the GPU tests are failing, can you please take a look?

@divyashreepathihalli - Fixed. There was a missing Kaggle link for the XL preset.