Request assistance for input shape

I am attempting to use this model architecture for some research. I have copy/pasted the code with no changes and when I attempt to send a rand tensor through the model, the q, k, v = torch.split line of the GatedAxialAttention class is throwing exceptions.

model = AxialLOB(W=40, H=40, c_in=4, c_out=4, c_final=32, n_heads=4, pool_kernel=(1,4), pool_stride=(1,4))
model.to(device=device)

input = torch.rand(1, 40, 40).unsqueeze(0) # results in (1, 1, 40, 40)
model(input.to(device=device))

throws the following exception:

Slightly length exception

### File paths altered for brevity ``` Python RuntimeError Traceback (most recent call last) Cell In[33], [line 2](vscode-notebook-cell:?execution_count=33&line=2) [1](vscode-notebook-cell:?execution_count=33&line=1) input = torch.rand(1, 40, 40).unsqueeze(0) ----> [2](vscode-notebook-cell:?execution_count=33&line=2) model(input.to(device=device)) File ...\envs\torch-gpu\lib\site-packages\torch\nn\modules\module.py:1532, in Module._wrapped_call_impl(self, *args, **kwargs) [1530](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1530) return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc] [1531](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1531) else: -> [1532](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1532) return self._call_impl(*args, **kwargs) File ...\envs\torch-gpu\lib\site-packages\torch\nn\modules\module.py:1541, in Module._call_impl(self, *args, **kwargs) [1536](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1536) # If we don't have any hooks, we want to skip the rest of the logic in [1537](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1537) # this function, and just call forward. [1538](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1538) if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks [1539](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1539) or _global_backward_pre_hooks or _global_backward_hooks [1540](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1540) or _global_forward_hooks or _global_forward_pre_hooks): -> [1541](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1541) return forward_call(*args, **kwargs) [1543](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1543) try: [1544](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1544) result = None File ...\model-store\deepLOB\deepLOB.py:56, in AxialLOB.forward(self, x) [53](.../model-store/deepLOB/deepLOB.py:53) y = self.activation(y) [55](.../model-store/deepLOB/deepLOB.py:55) #attention mechanism through gated multi head axial layer ---> [56](.../model-store/deepLOB/deepLOB.py:56) y = self.axial_width_1(y) [57](.../model-store/deepLOB/deepLOB.py:57) y = self.axial_height_1(y) [59](.../model-store/deepLOB/deepLOB.py:59) #lower branch File ...\envs\torch-gpu\lib\site-packages\torch\nn\modules\module.py:1532, in Module._wrapped_call_impl(self, *args, **kwargs) [1530](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1530) return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc] [1531](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1531) else: -> [1532](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1532) return self._call_impl(*args, **kwargs) File ...\envs\torch-gpu\lib\site-packages\torch\nn\modules\module.py:1541, in Module._call_impl(self, *args, **kwargs) [1536](.../envs/torch-gpu/lib/site-packages/torch/nn/modules/module.py:1536) # If we don't have any hooks, we want to skip the rest of the logic in ... --> [146](.../model-store/deepLOB/deepLOB.py:146) q, k, v = torch.split(qkv.reshape(N * W, self.heads, self.dim_head_v * 2, H), [self.dim_head_v // 2, self.dim_head_v // 2, self.dim_head_v], dim=2) [148](.../model-store/deepLOB/deepLOB.py:148) # Calculate position embedding [149](.../model-store/deepLOB/deepLOB.py:149) all_embeddings = torch.index_select(self.relative, 1, self.flatten_index).view(self.dim_head_v * 2, self.dim, self.dim) RuntimeError: shape '[40, 4, 2, 40]' is invalid for input of size 6400 ```

This is a straight copy/paste from the ipynb file on the repo so:

Am I missing something with the input shape?
Has anyone else run into this issue?

What I have elimiated so far:

There are no edits to the repo coded version
Model and tensors are on the same device
No stale kernels

Any suggestions would be greatly appreciated.

LeonardoBerti00 / Axial-LOB-High-Frequency-Trading-with-Axial-Attention

Request assistance for input shape #2