kyegomez / zeta

Build high-performance AI models with modular building blocks
https://zeta.apac.ai
Apache License 2.0

[BUG] no default value for depth of AttentionLayer #147

Closed evelynmitchell closed 5 months ago

evelynmitchell commented 7 months ago

Describe the bug

ERROR tests/structs/test_transformer.py::test_creation - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward[x0-expected_output_size0] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward[x1-expected_output_size1] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward[x2-expected_output_size2] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward_exception[wrong_input0] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward_exception[wrong_input1] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward_exception[string] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'

AttentionLayers has a depth parameter, but it has no default value:

class AttentionLayers(nn.Module):
    def __init__(
        self,
        dim,
        depth,
        heads=8,
        causal=False,
        cross_attend=False,
        only_cross=False,
        use_scalenorm=False,
        use_rmsnorm=False,
        use_simple_rmsnorm=False,
        alibi_pos_bias=False,
        alibi_num_heads=None,
        rel_pos_bias=False,
        rel_pos_num_buckets=32,
        rel_pos_max_distance=128,
        dynamic_pos_bias=False,
        dynamic_pos_bias_log_distance=False,
        dynamic_pos_bias_mlp_depth=2,
        dynamic_pos_bias_norm=False,
        rotary_pos_emb=False,
        rotary_emb_dim=None,
        rotary_xpos=False,
        rotary_interpolation_factor=1.0,
        rotary_xpos_scale_base=512,
        rotary_base_rescale_factor=1.0,
        custom_layers=None,
        sandwich_coef=None,
        par_ratio=None,
        residual_attn=False,
        cross_residual_attn=False,
        macaron=False,
        pre_norm=True,
        pre_norm_has_final_norm=True,
        gate_residual=False,
        scale_residual=False,
        scale_residual_constant=1.0,
        deepnorm=False,
        shift_tokens=0,
        sandwich_norm=False,
        resi_dual=False,
        resi_dual_scale=1.0,
        zero_init_branch_output=False,
        layer_dropout=0.0,
        cross_attn_tokens_dropout=0.0,
        **kwargs,
    ):

So, when it is called with only a single positional argument (as in the test_transformer fixture), there is an error:

@pytest.fixture()
def init_transformer():
    attn_layers = AttentionLayers(
        256
    )
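
One possible fix at the call site would be for the fixture to pass a depth explicitly. A minimal sketch of that change, assuming the import path used by the tests and an arbitrary depth of 6 chosen purely for illustration:

import pytest

from zeta.structs import AttentionLayers  # import path assumed from tests/structs/


@pytest.fixture()
def init_transformer():
    # Supply both required positional arguments; 256 and 6 are illustrative sizes.
    attn_layers = AttentionLayers(dim=256, depth=6)
    return attn_layers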

To Reproduce Steps to reproduce the behavior:

  1. Run the transformer tests, e.g. pytest tests/structs/test_transformer.py
  2. See the TypeError errors listed above

Expected behavior The tests should be able to construct AttentionLayers without a TypeError, either because depth has a sensible default or because the fixture passes one explicitly.
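
The other option, matching the title of this issue, is to give depth a default value in the signature. A minimal sketch of that change, with depth=1 as an assumed placeholder default and the remaining options collapsed into **kwargs for brevity; the actual default is a design decision for the maintainers:

from torch import nn


class AttentionLayers(nn.Module):
    def __init__(
        self,
        dim,
        depth=1,   # assumed placeholder default, not the library's chosen value
        heads=8,
        causal=False,
        **kwargs,  # all other options from the signature above, unchanged
    ):
        super().__init__()
        # ... rest of the constructor body unchanged ...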

github-actions[bot] commented 5 months ago

Stale issue message