This project is amazing work. It would be perfect if there were some examples like miniGPT and miniBERT.
This code is very unreadable and needs a huge refactoring. I don't know, but we probably have to start doing it if they don't. I tried this code and it works. You got the dimension order wrong:
import torch
from modules import Hopfield  # import paths assumed from the repository layout; newer releases may use `hflayers`
from modules.transformer import HopfieldDecoderLayer

decoder_layer = HopfieldDecoderLayer(Hopfield(512, num_heads=8), Hopfield(512, num_heads=8))
memory = torch.rand(10, 20, 512)  # (N, S, E): batch size x memory length x embedding size
tgt = torch.rand(10, 32, 512)     # (N, T, E): batch size x target length x embedding size
m_mask = torch.rand(32, 20)       # (T, S): target length x memory length
out = decoder_layer(tgt, memory, memory_mask=m_mask)
print(out)
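If the shapes line up, the output should match the target shape (a quick sanity check; this assumes the decoder layer preserves the target shape, as its torch.nn counterpart does):

assert out.shape == tgt.shape  # expected: torch.Size([10, 32, 512])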
I'm sorry that this code was very unreadable. It can be read now.
The shape of memory in your code is different: the suggested shape in the hopfield_core_forward function is (S, N, E). Did I get something wrong here?
I mean the source code of the repository, not yours. You should pay attention to the fact that the first dimension of memory and tgt is the batch size. I chose a batch size of 10 and two sequences of lengths 20 and 32, so the mask is 32x20.
I get it. So it is different from TransformerDecoder, where the second dimension of memory and tgt is the batch size.
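In other words (a shape-only sketch of the two conventions):

import torch

tgt_hopfield = torch.rand(10, 32, 512)     # Hopfield default: (N, T, E), batch dimension first
tgt_transformer = torch.rand(32, 10, 512)  # torch.nn.Transformer*: (T, N, E), batch dimension second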
Thanks for your interest in our work!
By default, the first dimension is the batch size, as seen in the Hopfield initializer https://github.com/ml-jku/hopfield-layers/blob/79162bee601f861befe17aea97dd1b0c12c6f465/modules/__init__.py#L41
and the corresponding docstring https://github.com/ml-jku/hopfield-layers/blob/79162bee601f861befe17aea97dd1b0c12c6f465/modules/__init__.py#L73
I assume the issue to be solved after supplying the appropriate initializer argument.
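Concretely, a sketch of what that could look like (this assumes the batch_first flag from the linked initializer, and that HopfieldDecoderLayer simply forwards tensors to the supplied Hopfield modules; not tested against this exact commit):

import torch
from modules import Hopfield
from modules.transformer import HopfieldDecoderLayer

# batch_first=False makes the Hopfield modules expect (S, N, E), matching torch.nn.
hopfield_self = Hopfield(input_size=512, num_heads=8, batch_first=False)
hopfield_cross = Hopfield(input_size=512, num_heads=8, batch_first=False)
decoder_layer = HopfieldDecoderLayer(hopfield_self, hopfield_cross)

memory = torch.rand(20, 10, 512)  # (S, N, E)
tgt = torch.rand(32, 10, 512)     # (T, N, E)
out = decoder_layer(tgt, memory)  # expected: (32, 10, 512)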
Using torch.nn.TransformerDecoderLayer as an example, the code is as follows.
import torch
from torch import nn
decoder_layer = nn.TransformerDecoderLayer(d_model=512, nhead=8)
It will go wrong when HopfieldDecoderLayer is substituted for TransformerDecoderLayer, as sketched below.
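A sketch of the mismatch (shapes follow the (S, N, E) convention of torch.nn; HopfieldDecoderLayer is assumed to keep its default batch_first=True):

import torch
from torch import nn
from modules import Hopfield
from modules.transformer import HopfieldDecoderLayer

memory = torch.rand(20, 10, 512)  # (S, N, E), as torch.nn expects
tgt = torch.rand(32, 10, 512)     # (T, N, E)

# Works: nn.TransformerDecoderLayer reads these as sequence lengths 20 and 32.
out = nn.TransformerDecoderLayer(d_model=512, nhead=8)(tgt, memory)

# Fails (or misreads the batch axis): the Hopfield modules default to
# batch_first=True, i.e. (N, L, E), so the same tensors are interpreted as
# batch sizes 20 and 32 instead.
hopfield_layer = HopfieldDecoderLayer(Hopfield(512, num_heads=8), Hopfield(512, num_heads=8))
# out = hopfield_layer(tgt, memory)  # wrong layout for the default setting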