Closed DavidLandup0 closed 3 weeks ago
@DavidLandup0 the GPU tests are failing, can you please take a look?
@divyashreepathihalli - Fixed - there was a missing Kaggle link for the XL preset
Turns out we already supported T5 1.1 operationally (given the gated activations), but weight conversion and presets only covered vanilla T5 models.
This PR updates the conversion script to include the T5 1.1 variants:
For example:
It also updates the conversion to use the `save_to_preset()` functionality and fixes assertions that raised exceptions, and saves the tokenizer as well.
Numerical Equivalence
Behaves equally to:
PCA on flattened outputs running on the same input:
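For reference, a minimal sketch of how this kind of check can be reproduced. It is not the code from the conversion script. It assumes both models return hidden states of shape `(batch, seq_len, hidden)`, that "flattened" means merging the batch and sequence axes into rows, and it uses random stand-in arrays rather than real model outputs. The helper names `max_abs_diff`, `fit_pca` and `project` are hypothetical.

```python
import numpy as np


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two same-shaped arrays."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    return float(np.max(np.abs(a - b)))


def fit_pca(outputs, n_components=2):
    """Fit PCA on outputs flattened to (batch * seq_len, hidden).

    Returns the feature mean and the top principal directions.
    """
    x = np.asarray(outputs, dtype=np.float64)
    x = x.reshape(-1, x.shape[-1])
    mean = x.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(outputs, mean, components):
    """Project flattened outputs onto previously fitted principal directions."""
    x = np.asarray(outputs, dtype=np.float64)
    x = x.reshape(-1, x.shape[-1])
    return (x - mean) @ components.T


# Stand-in data (assumption): a "reference" output and a "converted" output
# that differs only by small numerical noise.
rng = np.random.default_rng(0)
reference = rng.normal(size=(2, 8, 16))
converted = reference + 1e-7 * rng.normal(size=reference.shape)

diff = max_abs_diff(reference, converted)
equivalent = diff <= 1e-5  # tolerance is an assumed value, not from the PR

# Fit on the reference and project both with the same components, so the two
# point clouds share one coordinate system and can be overlaid in a plot.
mean, components = fit_pca(reference)
points_reference = project(reference, mean, components)
points_converted = project(converted, mean, components)
```

Fitting PCA once and reusing the components for both models avoids the sign and ordering ambiguity you get when each output is decomposed separately.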
Notes
The XXL version (11B params, 44 GB for weights) is too large to run on consumer hardware, so I can't run the conversion script on it. I'll get the XL weights up on Kaggle as soon as the download is finished.
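As a quick sanity check on that figure: 44 GB is what you get for 11B parameters stored as float32 (4 bytes each, decimal gigabytes). The dtype is my assumption, and `weight_size_gb` is just an illustrative helper.

```python
def weight_size_gb(num_params, bytes_per_param=4):
    """Approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


xxl_params = 11_000_000_000  # 11B parameters, as stated above

# float32 is assumed to be the stored dtype; bfloat16 is shown for comparison.
for dtype_name, nbytes in [("float32", 4), ("bfloat16", 2)]:
    print(f"{dtype_name}: {weight_size_gb(xxl_params, nbytes):.0f} GB")
# float32: 44 GB
# bfloat16: 22 GB
```

This counts only the weights. Running the conversion script needs additional memory on top of that.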
/cc @divyashreepathihalli