NX-AI / xlstm

Official repository of the xLSTM.
GNU Affero General Public License v3.0

'sLSTMLayer' object has no attribute 'step' #20

Open hiimbach opened 2 weeks ago

hiimbach commented 2 weeks ago

Hi, thanks so much for your work.

I am testing the model with multiple configs. When calling the step() method to get both the output and the state, I found that models containing an sLSTM layer fail, because sLSTMLayer has no step method. Instead, to get the state from that layer, one must pass the argument return_last_state=True. As a result, the xLSTM language model cannot return its state through step(). This is the code I used:

from omegaconf import OmegaConf
import torch
from dacite import from_dict
from dacite import Config as DaciteConfig
from xlstm import xLSTMLMModel, xLSTMLMModelConfig

xlstm_cfg = """ 
vocab_size: 50304
mlstm_block:
  mlstm:
    conv1d_kernel_size: 4
    qkv_proj_blocksize: 4
    num_heads: 4
slstm_block:
  slstm:
    backend: vanilla
    num_heads: 4
    conv1d_kernel_size: 4
    bias_init: powerlaw_blockdependent
  feedforward:
    proj_factor: 1.3
    act_fn: gelu
context_length: 256
num_blocks: 7
embedding_dim: 128
slstm_at: [1]
"""
cfg = OmegaConf.create(xlstm_cfg)
cfg = from_dict(data_class=xLSTMLMModelConfig, data=OmegaConf.to_container(cfg), config=DaciteConfig(strict=True))
model = xLSTMLMModel(cfg)

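# Example batch of token ids (not used by the step() call below).
x = torch.randint(0, 50304, size=(4, 256)).to("cpu")
model = model.to("cpu")

# Single-token step with input shape (1, 1); this call raises the error.
model.step(torch.tensor([[1]], dtype=torch.long))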

This is the error: AttributeError: 'sLSTMLayer' object has no attribute 'step'

Could you add a step method to the sLSTM layer so that its interface is consistent with the other layers? Thank you so much.
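
To make the request concrete, here is a minimal sketch of the mismatch and one possible workaround. It uses toy stand-ins only: ToyMLSTMLayer, ToySLSTMLayer, StepAdapter and step_stack are hypothetical names, not the real xlstm classes, and the real layers' signatures may differ. The only assumption carried over from the report is that one layer exposes step() while the other returns its state only via forward(..., return_last_state=True).

```python
# Toy stand-ins for illustration; these are NOT the real xlstm classes.

class ToyMLSTMLayer:
    """Layer that already exposes step(): one input in, (output, state) out."""

    def step(self, x, state=None):
        new_state = (0 if state is None else state) + x
        return new_state, new_state


class ToySLSTMLayer:
    """Layer with no step(); state is only available via return_last_state."""

    def forward(self, xs, state=None, return_last_state=False):
        total = 0 if state is None else state
        outputs = []
        for x in xs:
            total += x
            outputs.append(total)
        if return_last_state:
            return outputs, total
        return outputs


class StepAdapter:
    """Hypothetical wrapper that gives a forward-only layer a step() method."""

    def __init__(self, layer):
        self.layer = layer

    def step(self, x, state=None):
        # Run forward on a length-1 sequence and ask for the last state.
        outputs, new_state = self.layer.forward(
            [x], state=state, return_last_state=True
        )
        return outputs[-1], new_state


def step_stack(layers, x, states=None):
    """Call step() on every layer in order, threading per-layer states."""
    states = states or [None] * len(layers)
    new_states = []
    for layer, state in zip(layers, states):
        x, new_state = layer.step(x, state)
        new_states.append(new_state)
    return x, new_states
```

Calling step_stack([ToyMLSTMLayer(), ToySLSTMLayer()], 1) fails with the same kind of AttributeError as above, while wrapping the second layer in StepAdapter lets the whole stack be stepped one token at a time. A step method built into the sLSTM layer itself would make such a wrapper unnecessary.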

liujike commented 1 week ago

Which version are you using?