konumaru / commonLit_readability_prize

https://www.kaggle.com/c/commonlitreadabilityprize

Try out DataCollatorForLanguageModeling #18

Closed konumaru closed 3 years ago

konumaru commented 3 years ago
import torch
import torch.nn as nn
from torch.nn.functional import gelu  # stand-in for transformers' own gelu


class RobertaLMHead(nn.Module):
    """RoBERTa head for masked language modeling."""

    def __init__(self, config):
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.hidden_size)
        self.layer_norm = nn.LayerNorm(config.hidden_size, eps=config.layer_norm_eps)

        self.decoder = nn.Linear(config.hidden_size, config.vocab_size, bias=False)
        self.bias = nn.Parameter(torch.zeros(config.vocab_size))

        # Need a link between the two variables so that the bias is correctly resized with `resize_token_embeddings`
        self.decoder.bias = self.bias

    def forward(self, features, **kwargs):
        x = self.dense(features)
        x = gelu(x)
        x = self.layer_norm(x)

        # project back to size of vocabulary with bias
        x = self.decoder(x)

        return x
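
The head above only maps hidden states back to vocabulary logits; the actual masking is done by DataCollatorForLanguageModeling. A minimal sketch of wiring up the collator, assuming roberta-base and the default 15% masking rate (the example texts are made up):

from transformers import DataCollatorForLanguageModeling, RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=True,              # mask inputs for masked language modeling
    mlm_probability=0.15,  # default masking rate
)

# Made-up example texts; in the competition these would be the excerpts.
texts = ["The excerpt is easy to read.", "Readability is hard to measure."]
features = [tokenizer(text) for text in texts]

batch = collator(features)
# batch["input_ids"]: ~15% of tokens replaced (mostly by <mask>);
# batch["labels"]: -100 everywhere except the masked positions.
print(batch["input_ids"].shape, batch["labels"].shape)

The collator also pads the batch, so the masked inputs and labels can be fed straight into an MLM forward pass.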
konumaru commented 3 years ago

https://huggingface.co/transformers/_modules/transformers/models/roberta/modeling_roberta.html#RobertaModel
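
The linked RobertaModel is just the encoder body; for MLM, the RobertaLMHead above sits on top of it, as in RobertaForMaskedLM. A minimal sketch assuming roberta-base; the wrapper class name is hypothetical:

import torch.nn as nn
from transformers import RobertaModel

class RobertaMLMSketch(nn.Module):
    """Hypothetical wrapper: RobertaModel body plus the RobertaLMHead above."""

    def __init__(self, model_name="roberta-base"):
        super().__init__()
        self.roberta = RobertaModel.from_pretrained(model_name)
        self.lm_head = RobertaLMHead(self.roberta.config)

    def forward(self, input_ids, attention_mask=None, labels=None):
        outputs = self.roberta(input_ids=input_ids, attention_mask=attention_mask)
        logits = self.lm_head(outputs.last_hidden_state)  # (batch, seq_len, vocab)
        loss = None
        if labels is not None:
            # DataCollatorForLanguageModeling sets labels to -100 for unmasked
            # positions, which CrossEntropyLoss ignores by default.
            loss_fct = nn.CrossEntropyLoss()
            loss = loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))
        return loss, logits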

konumaru commented 3 years ago

Continued in #6.