numenta / nupic.research

Experimental algorithms. Unsupported.
https://nupicresearch.readthedocs.io
GNU Affero General Public License v3.0

Sparse Transformers: new models, updated checkpoints, and new configs. #470

Closed mvacaporale closed 3 years ago

mvacaporale commented 3 years ago

Per the new models:

Per the checkpoints:

Per the configs:

Example with register_bert_model:

from dataclasses import dataclass

from transformers import BertModel

@register_bert_model  # decorator provided by this repo
class SparseBertModel(BertModel):

    @dataclass
    class ConfigKWargs:
        # Keyword arguments to configure sparsity.
        sparsity: float = 0.9

    # Define __init__, etc.
    ...

This will automatically create new classes called SparseBertConfig, SparseBertForMaskedLM, and SparseBertForSequenceClassification. Notice that the naming is automatic and is a function of the name of your original class. For instance, if you define DynamicSparseBertModel, you'd get a class named DynamicSparseBertConfig and so on.
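
For illustration, here's a rough sketch of how those names could be derived from the class name (my own sketch of the naming convention, not the decorator's actual code):

import re

def derive_names(model_cls_name):
    # "SparseBertModel" -> "SparseBert"
    base = model_cls_name[:-len("Model")]
    # "SparseBert" -> "sparse_bert"
    model_type = re.sub(r"(?<!^)(?=[A-Z])", "_", base).lower()
    return model_type, base + "Config", base + "ForMaskedLM"

derive_names("DynamicSparseBertModel")
>>> ('dynamic_sparse_bert', 'DynamicSparseBertConfig', 'DynamicSparseBertForMaskedLM')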

As soon as you define the class, it's ready to autoload. For instance, you could do

from transformers import AutoConfig, AutoModelForMaskedLM

config = AutoConfig.for_model(model_type="sparse_bert", sparsity=0.5)
model = AutoModelForMaskedLM.from_config(config)

type(model)
>>> SparseBertForMaskedLM
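
Autoloading works because the generated classes get registered with the transformers auto classes. Here's a minimal sketch of that wiring, assuming the public registration hooks that transformers exposes (the decorator may do this differently internally):

from transformers import (
    AutoConfig,
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
)

# Map the model_type string to the generated config class ...
AutoConfig.register("sparse_bert", SparseBertConfig)

# ... and map the config class to each generated model class.
AutoModelForMaskedLM.register(SparseBertConfig, SparseBertForMaskedLM)
AutoModelForSequenceClassification.register(
    SparseBertConfig, SparseBertForSequenceClassification
)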

Notice how the model_type "sparse_bert" has also been formatted automatically; in the other example, you'd use model_type="dynamic_sparse_bert". The config also comes equipped to accept the sparsity argument, which your model can then access. Thus, you can run

config.sparsity
>>> 0.5

This comes from the ConfigKWargs defined above. You can add whatever arguments you want to that dataclass. With this, we can modify our experiment configs and configure our models as desired.
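
For instance, additional fields flow through the same way (num_sparse_layers below is a hypothetical argument, just for illustration):

@register_bert_model
class SparseBertModel(BertModel):

    @dataclass
    class ConfigKWargs:
        sparsity: float = 0.9
        num_sparse_layers: int = 6  # hypothetical extra argument

    ...

config = AutoConfig.for_model(
    model_type="sparse_bert", sparsity=0.5, num_sparse_layers=3
)
config.num_sparse_layers
>>> 3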