NX-AI / xlstm

Official repository of the xLSTM.
GNU Affero General Public License v3.0

How to use xlstm when the context length is not fixed? Or must it be fixed? #1

Open yxchng opened 4 weeks ago

alexdemartos commented 3 weeks ago

The context_length is only used to build the causal mask in the mLSTM block. Looking at the relevant code:

    # Rebuild the lower-triangular (causal) mask if none was passed in,
    # or if the current sequence length S is shorter than the cached mask.
    if lower_triangular_matrix is None or S < lower_triangular_matrix.size(-1):
        ltr = torch.tril(torch.ones((S, S), dtype=torch.bool, device=_device))

you can simply set context_length to the maximum sequence length you expect; shorter inputs just trigger an on-the-fly mask of the right size, as the condition above shows.
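
For illustration, here is a minimal sketch of that approach, following the configuration style shown in the repository README. The config fields (context_length, num_blocks, embedding_dim, mlstm_block) are taken from that README; the concrete values, and the assumption that an mLSTM-only stack runs on CPU, are placeholders for this example:

    import torch
    from xlstm import (
        xLSTMBlockStack,
        xLSTMBlockStackConfig,
        mLSTMBlockConfig,
    )

    # context_length only sizes the causal mask, so set it to the
    # longest sequence you expect to feed the model (512 assumed here).
    cfg = xLSTMBlockStackConfig(
        mlstm_block=mLSTMBlockConfig(),
        context_length=512,
        num_blocks=2,
        embedding_dim=128,
    )
    model = xLSTMBlockStack(cfg)

    # Shorter sequences still work: per the snippet above, the mask is
    # rebuilt on the fly whenever S is smaller than the cached mask.
    x = torch.randn(1, 200, 128)   # (batch, seq_len, embedding_dim)
    y = model(x)                   # -> shape (1, 200, 128)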