kyegomez / BitNet

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch
https://discord.gg/qUtxnK2NMf
MIT License

[BUG] NoneType in sequential module in bit_ffn #43

Closed jayUyang closed 6 months ago

jayUyang commented 8 months ago

When post_act_ln is False, the self.ff sequential module can contain a None entry, which is not callable.

[Suggestion]

    # Build the layer list step by step so that no None entry reaches nn.Sequential.
    ff_layers = [project_in]
    if post_act_ln:
        ff_layers.append(nn.LayerNorm(inner_dim))
    ff_layers.append(nn.Dropout(dropout))
    ff_layers.append(BitLinear(inner_dim, dim_out, *args, bias=not no_bias, **kwargs))
    self.ff = nn.Sequential(*ff_layers)
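
For anyone who wants to reproduce this outside the repo, here is a minimal standalone sketch of the same idea. It is not the repository's code: build_ff is a hypothetical helper, nn.Linear stands in for BitLinear, and the GELU activation inside project_in is an assumption made only so the example runs on its own.

```python
import torch
from torch import nn


def build_ff(dim, inner_dim, dim_out, post_act_ln=False, dropout=0.0):
    """Assemble a feed-forward stack, adding LayerNorm only when requested."""
    # Assumption: nn.Linear replaces BitLinear, and GELU is a placeholder activation.
    project_in = nn.Sequential(nn.Linear(dim, inner_dim), nn.GELU())
    ff_layers = [project_in]
    if post_act_ln:
        ff_layers.append(nn.LayerNorm(inner_dim))
    ff_layers.append(nn.Dropout(dropout))
    ff_layers.append(nn.Linear(inner_dim, dim_out))
    return nn.Sequential(*ff_layers)


x = torch.randn(2, 8)

# A None placeholder in the stack raises TypeError, either when the
# container is built or when it is called, depending on the PyTorch version.
none_fails = False
try:
    broken = nn.Sequential(nn.Linear(8, 16), None, nn.Linear(16, 8))
    broken(x)
except TypeError:
    none_fails = True

# The conditionally built stack works with and without the LayerNorm.
ff_plain = build_ff(8, 16, 8, post_act_ln=False)
ff_norm = build_ff(8, 16, 8, post_act_ln=True)
out_plain = ff_plain(x)
out_norm = ff_norm(x)
```

Another option would be to put nn.Identity() in the slot instead of None, which keeps the layer indices the same in both configurations. Appending conditionally, as above, keeps the module list shorter.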


github-actions[bot] commented 6 months ago

Stale issue message