This PR includes two changes: (1) it moves each model's architecture definitions to a separate `archs.py` and registers them explicitly in `__init__.py` instead of relying on implicit import statements; (2) it refactors the NLLB model definition and defines a vanilla Transformer model that NLLB is now based on (i.e. `load_transformer_model()` instead of `load_nllb_model()`). To preserve backwards compatibility, `load_nllb_model()` still exists.
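For reviewers, here is a minimal sketch of the pattern described above, not the actual code in this PR. `ArchRegistry`, `TransformerConfig`, `register_archs`, the `"example"` architecture, and the function signatures are all hypothetical stand-ins; only the names `load_transformer_model()` and `load_nllb_model()` come from the description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class TransformerConfig:
    """Hypothetical config; the real fields are not given in this PR."""

    model_dim: int
    num_layers: int


class ArchRegistry:
    """Hypothetical name-to-factory registry for architecture configs."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], TransformerConfig]] = {}

    def register(self, name: str, factory: Callable[[], TransformerConfig]) -> None:
        # Fail loudly on duplicates so a name cannot be silently overridden.
        if name in self._factories:
            raise ValueError(f"architecture '{name}' is already registered")
        self._factories[name] = factory

    def get(self, name: str) -> TransformerConfig:
        try:
            return self._factories[name]()
        except KeyError:
            raise ValueError(f"unknown architecture '{name}'") from None

    def names(self) -> List[str]:
        return sorted(self._factories)


# --- what a per-model archs.py might contain (assumption) ---
def _example_arch() -> TransformerConfig:
    # Illustrative values only, not a real NLLB or Transformer configuration.
    return TransformerConfig(model_dim=16, num_layers=2)


def register_archs(registry: ArchRegistry) -> None:
    registry.register("example", _example_arch)


# --- what __init__.py might do: an explicit call, no import side effects ---
transformer_archs = ArchRegistry()
register_archs(transformer_archs)


def load_transformer_model(arch: str) -> TransformerConfig:
    # Stand-in for a real loader: resolves and returns the config only.
    return transformer_archs.get(arch)


def load_nllb_model(arch: str) -> TransformerConfig:
    # Backwards-compatible wrapper that delegates to the new loader.
    return load_transformer_model(arch)
```

With explicit registration, the set of available architectures depends only on which `register_archs()` calls run, not on which modules happen to be imported first, and old call sites that use `load_nllb_model()` keep working.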