This pull request reworks the Transformer-based modules in the bert_squeeze package. It streamlines model instantiation and adds features for integrating encoder-decoder models, custom ones in particular.
Changelog
Features
Extend BaseEncoderDecoderModel to support VisionEncoderDecoder, allowing initialization from custom encoder and decoder or pre-trained vision models.
Implement an EncoderDecoderModel class that loads pre-trained transformer checkpoints and provides methods for replacing the encoder and decoder instances.
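To illustrate the idea of swapping encoder and decoder instances after construction, here is a minimal, dependency-free sketch. It is not the bert_squeeze implementation: the class SwappableEncoderDecoder, the methods replace_encoder and replace_decoder, and the toy encoder and decoder functions are all hypothetical names chosen for this example, and the real class wraps pre-trained transformer checkpoints rather than plain functions.

```python
from typing import Callable, List

# Hypothetical stand-ins: an "encoder" maps token ids to hidden values,
# a "decoder" maps hidden values back to token ids.
Encoder = Callable[[List[int]], List[float]]
Decoder = Callable[[List[float]], List[int]]


class SwappableEncoderDecoder:
    """Illustrative wrapper only; not the actual bert_squeeze class."""

    def __init__(self, encoder: Encoder, decoder: Decoder) -> None:
        self.encoder = encoder
        self.decoder = decoder

    def replace_encoder(self, encoder: Encoder) -> None:
        # Swap the encoder while leaving the decoder untouched.
        self.encoder = encoder

    def replace_decoder(self, decoder: Decoder) -> None:
        # Swap the decoder while leaving the encoder untouched.
        self.decoder = decoder

    def forward(self, token_ids: List[int]) -> List[int]:
        # Encode first, then decode the resulting hidden values.
        return self.decoder(self.encoder(token_ids))


def scale_encoder(token_ids: List[int]) -> List[float]:
    return [float(t) * 0.5 for t in token_ids]


def identity_encoder(token_ids: List[int]) -> List[float]:
    return [float(t) for t in token_ids]


def round_decoder(hidden: List[float]) -> List[int]:
    return [int(round(h)) for h in hidden]


model = SwappableEncoderDecoder(scale_encoder, round_decoder)
before = model.forward([2, 4, 6])

# Replace only the encoder; the decoder instance is reused as-is.
model.replace_encoder(identity_encoder)
after = model.forward([2, 4, 6])
```

Keeping the two components as replaceable attributes lets a caller start from one configuration and substitute a custom encoder or decoder without rebuilding the whole wrapper.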
Refactors
Introduce BASE_CLASS_MODEL to instantiate Transformer-based modules, which removes the need for the _build_model method and simplifies the model classes.
Remove unnecessary @overrides decorators from model classes.
Allow BaseTransformerModule constructors to accept an optional, already instantiated model.
Update configuration files to point to the correct targets after the structural changes.
Simplify the forward method and add generate methods for BaseEncoderDecoderModel.
Introduce an optional model parameter in the SimpleT5Model constructor for more streamlined instantiation.
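The pattern behind BASE_CLASS_MODEL and the optional model argument can be sketched roughly as follows. This is an assumption about the general shape of the refactor, not the package's code: TinyBackbone, OtherBackbone, BaseModule, OtherModule, and the checkpoint string are hypothetical, and only the attribute name BASE_CLASS_MODEL comes from the changelog.

```python
from typing import Optional


class TinyBackbone:
    """Hypothetical stand-in for a pre-trained model class."""

    def __init__(self, name: str) -> None:
        self.name = name

    @classmethod
    def from_pretrained(cls, name: str) -> "TinyBackbone":
        # A real backbone would load weights here; this one only stores the name.
        return cls(name)


class OtherBackbone(TinyBackbone):
    """A second hypothetical backbone, used by a subclass below."""


class BaseModule:
    # Subclasses override this class attribute instead of a _build_model method.
    BASE_CLASS_MODEL = TinyBackbone

    def __init__(self, pretrained_name: str, model: Optional[TinyBackbone] = None) -> None:
        if model is not None:
            # An already instantiated model takes precedence.
            self.model = model
        else:
            self.model = self.BASE_CLASS_MODEL.from_pretrained(pretrained_name)


class OtherModule(BaseModule):
    BASE_CLASS_MODEL = OtherBackbone


default_module = BaseModule("some-checkpoint")
other_module = OtherModule("some-checkpoint")

custom_backbone = TinyBackbone("hand-built")
injected_module = BaseModule("unused-name", model=custom_backbone)
```

With this layout a subclass only declares which class to instantiate, and callers who already hold a model object can pass it in directly.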
Test Updates
Update existing tests to adhere to the new model structure.
Add a test case for T5 summarization with a custom BaseEncoderDecoderModel.