But the network continuously reports an error if you try to add a batch size to the input, e.g.:
x = torch.randn(32, 4, 256, 128).to("cuda") # (where 32 is the batch size)
You get the following error:
File "/home/carlosgomezh/.local/lib/python3.10/site-packages/xlstm/blocks/mlstm/layer.py", line 102, in forward
B, S, _ = x.shape
ValueError: too many values to unpack (expected 3)
In your case it is a backbone processing a single tensor.
Is it possible to process something like this:
import torch

if __name__ == "__main__":
    # Define model hyperparameters
    input_dim = 6
    hidden_dim = 128
    output_dim = 1
    num_layers = 2
    context_length = 10

    # Instantiate the model (xLSTM is assumed to be a user-defined wrapper, not a class exported by the library)
    model = xLSTM(input_dim, hidden_dim, output_dim, num_layers, context_length).to('cuda')

    # Print the model structure
    print(model)

    # Example dummy input (batch_size=32, sequence_length=10, input_dim=6)
    dummy_input = torch.randn(32, context_length, input_dim).to('cuda')

    # Forward pass through the model
    output = model(dummy_input)
    print(output.shape)
where you have 6 input features, the h_dim of the network is 128 (for example), the output dim is 1, and the context length is 10? Obviously, 32 represents the batch size.
If I run that code, I get the following error:
File "/home/carlosgomezh/.local/lib/python3.10/site-packages/torch/nn/functional.py", line 2573, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: Given normalized_shape=[128], expected input with shape [*, 128], but got input of size[32, 10, 6]
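For context, that layer_norm error means the last dimension of the input (6) does not match the embedding_dim the block stack was built with (128). A hypothetical wrapper along the following lines (the class layout, the Linear projections, and the config values are assumptions, not code from this thread) would first project the 6 input features to the embedding dimension:

import torch
from torch import nn
from xlstm import (
    xLSTMBlockStack,
    xLSTMBlockStackConfig,
    mLSTMBlockConfig,
    mLSTMLayerConfig,
)

class xLSTM(nn.Module):
    # Hypothetical wrapper: project input_dim -> hidden_dim, run the block
    # stack, then map the last time step to output_dim.
    def __init__(self, input_dim, hidden_dim, output_dim, num_layers, context_length):
        super().__init__()
        self.in_proj = nn.Linear(input_dim, hidden_dim)
        self.stack = xLSTMBlockStack(
            xLSTMBlockStackConfig(
                mlstm_block=mLSTMBlockConfig(
                    mlstm=mLSTMLayerConfig(conv1d_kernel_size=4, qkv_proj_blocksize=4, num_heads=4)
                ),
                context_length=context_length,
                num_blocks=num_layers,
                embedding_dim=hidden_dim,
            )
        )
        self.out_proj = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):                  # x: (batch, seq_len, input_dim)
        h = self.stack(self.in_proj(x))    # (batch, seq_len, hidden_dim)
        return self.out_proj(h[:, -1, :])  # (batch, output_dim)

With such a projection in place, a dummy input of shape (32, 10, 6) matches what the stack expects.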
@Cram3r95 I think you have the wrong approach here: the size 4 in your example above is already the batch size, as the heads are only internal and not exposed.
This code is working:
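A minimal sketch along those lines, based on the repository README (the config values are assumptions, and only mLSTM blocks are used for brevity):

import torch
from xlstm import (
    xLSTMBlockStack,
    xLSTMBlockStackConfig,
    mLSTMBlockConfig,
    mLSTMLayerConfig,
)

cfg = xLSTMBlockStackConfig(
    mlstm_block=mLSTMBlockConfig(
        mlstm=mLSTMLayerConfig(conv1d_kernel_size=4, qkv_proj_blocksize=4, num_heads=4)
    ),
    context_length=256,
    num_blocks=7,
    embedding_dim=128,
)
xlstm_stack = xLSTMBlockStack(cfg).to("cuda")

# 3-D input: (batch_size, sequence_length, embedding_dim).
# The first dimension is the batch size (4 in the README example, 32 here);
# the input has no separate "heads" dimension.
x = torch.randn(32, 256, 128).to("cuda")
y = xlstm_stack(x)
print(y.shape)  # torch.Size([32, 256, 128])

The output keeps the input shape, so any batch size works as long as the tensor stays three-dimensional (batch, sequence, embedding).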
@kpoeppel @maximilianmbeck