kyegomez / zeta

Build high-performance AI models with modular building blocks
https://zeta.apac.ai
Apache License 2.0

[BUG] no default value for depth of AttentionLayer #147

Closed evelynmitchell closed 5 months ago

evelynmitchell commented 7 months ago

Describe the bug

ERROR tests/structs/test_transformer.py::test_creation - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward[x0-expected_output_size0] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward[x1-expected_output_size1] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward[x2-expected_output_size2] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward_exception[wrong_input0] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward_exception[wrong_input1] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward_exception[string] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'

AttentionLayers has a depth parameter, but it has no default value:

class AttentionLayers(nn.Module):
    def __init__(
        self,
        dim,
        depth,
        heads=8,
        causal=False,
        cross_attend=False,
        only_cross=False,
        use_scalenorm=False,
        use_rmsnorm=False,
        use_simple_rmsnorm=False,
        alibi_pos_bias=False,
        alibi_num_heads=None,
        rel_pos_bias=False,
        rel_pos_num_buckets=32,
        rel_pos_max_distance=128,
        dynamic_pos_bias=False,
        dynamic_pos_bias_log_distance=False,
        dynamic_pos_bias_mlp_depth=2,
        dynamic_pos_bias_norm=False,
        rotary_pos_emb=False,
        rotary_emb_dim=None,
        rotary_xpos=False,
        rotary_interpolation_factor=1.0,
        rotary_xpos_scale_base=512,
        rotary_base_rescale_factor=1.0,
        custom_layers=None,
        sandwich_coef=None,
        par_ratio=None,
        residual_attn=False,
        cross_residual_attn=False,
        macaron=False,
        pre_norm=True,
        pre_norm_has_final_norm=True,
        gate_residual=False,
        scale_residual=False,
        scale_residual_constant=1.0,
        deepnorm=False,
        shift_tokens=0,
        sandwich_norm=False,
        resi_dual=False,
        resi_dual_scale=1.0,
        zero_init_branch_output=False,
        layer_dropout=0.0,
        cross_attn_tokens_dropout=0.0,
        **kwargs,
    ):

So, when it is called with only a single positional argument (as in the test_transformer fixture), there is an error:

@pytest.fixture()
def init_transformer():
    attn_layers = AttentionLayers(
        256
    )
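
One possible fix at the call site would be for the fixture to pass a depth explicitly. A minimal sketch of that change, assuming the import path used by the tests and an arbitrary depth of 6 chosen purely for illustration:

import pytest

from zeta.structs import AttentionLayers  # import path assumed from tests/structs/


@pytest.fixture()
def init_transformer():
    # Supply both required positional arguments; 256 and 6 are illustrative sizes.
    attn_layers = AttentionLayers(dim=256, depth=6)
    return attn_layers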

To Reproduce Steps to reproduce the behavior:

  1. Run the transformer tests, e.g. pytest tests/structs/test_transformer.py
  2. See the TypeError errors listed above

Expected behavior The tests should be able to construct AttentionLayers without a TypeError, either because depth has a sensible default or because the fixture passes one explicitly.
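
The other option, matching the title of this issue, is to give depth a default value in the signature. A minimal sketch of that change, with depth=1 as an assumed placeholder default and the remaining options collapsed into **kwargs for brevity; the actual default is a design decision for the maintainers:

from torch import nn


class AttentionLayers(nn.Module):
    def __init__(
        self,
        dim,
        depth=1,   # assumed placeholder default, not the library's chosen value
        heads=8,
        causal=False,
        **kwargs,  # all other options from the signature above, unchanged
    ):
        super().__init__()
        # ... rest of the constructor body unchanged ...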

github-actions[bot] commented 5 months ago

Stale issue message