ml-jku / hopfield-layers

Hopfield Networks is All You Need
https://ml-jku.github.io/hopfield-layers/

I got an error when I tried to figure out how to use HopfieldDecoderLayer #7

Closed: luyifanlu closed this issue 4 years ago

luyifanlu commented 4 years ago

Using torch.nn.TransformerDecoderLayer as an example, the code is as follows:

```python
import torch
from torch import nn
from modules import Hopfield                          # assumed import path, per the repo's modules/__init__.py
from modules.transformer import HopfieldDecoderLayer  # assumed import path

# the PyTorch decoder layer works:
decoder_layer = nn.TransformerDecoderLayer(d_model=512, nhead=8)

# substituting the Hopfield-based decoder layer raises an error:
decoder_layer = HopfieldDecoderLayer(Hopfield(512, num_heads=8), Hopfield(512, num_heads=8))
memory = torch.rand(10, 32, 512)
tgt = torch.rand(20, 32, 512)
m_mask = torch.rand(20, 10)
out = decoder_layer(tgt, memory, memory_mask=m_mask)
```

It fails when HopfieldDecoderLayer is used instead of TransformerDecoderLayer.

luyifanlu commented 4 years ago

This project is amazing work. It would be perfect if there were some examples like miniGPT and miniBERT.

roholazandie commented 4 years ago

This code is very unreadable and needs a huge refactoring; we probably have to start doing it if the authors don't. I tried this code and it works. You got the dimension order wrong:

```python
decoder_layer = HopfieldDecoderLayer(Hopfield(512, num_heads=8), Hopfield(512, num_heads=8))
memory = torch.rand(10, 20, 512)  # (N, S, E): batch_size x source_length x embedding_size
tgt = torch.rand(10, 32, 512)     # (N, T, E): batch_size x target_length x embedding_size
m_mask = torch.rand(32, 20)       # (T, S): target_length x source_length
out = decoder_layer(tgt, memory, memory_mask=m_mask)
print(out)
```
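Continuing from the snippet above, a quick shape check (a minimal sketch, assuming the Hopfield modules default to a batch-first layout) confirms the convention:

```python
# output has the same shape as the target: (batch, target_length, embedding)
assert out.shape == (10, 32, 512)
# the mask relates target positions to memory positions: (target_length, source_length)
assert m_mask.shape == (tgt.shape[1], memory.shape[1])
```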
luyifanlu commented 4 years ago

I'm sorry that this code was so unreadable. It should be readable now.

luyifanlu commented 4 years ago

The shape of memory in your code is different from what I expected: the suggested shape in the hopfield_core_forward function is (S, N, E). Did I get something wrong here?

roholazandie commented 4 years ago

I mean the source code of the repository, not yours. You should pay attention to the fact that the first dimension of memory and tgt is the batch size. I chose a batch size of 10 and two sequences of lengths 20 and 32, so the mask is 32x20.

luyifanlu commented 4 years ago

I get it. So it is different from TransformerDecoder, where the second dimension of memory and tgt is the batch size.
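For comparison, the stock PyTorch decoder layer expects sequence-first (S, N, E) inputs; a minimal sketch of the same shapes in that convention:

```python
import torch
from torch import nn

torch_layer = nn.TransformerDecoderLayer(d_model=512, nhead=8)
tgt_t = torch.rand(32, 10, 512)     # (T, N, E): target_length x batch_size x embedding
memory_t = torch.rand(20, 10, 512)  # (S, N, E): source_length x batch_size x embedding
out_t = torch_layer(tgt_t, memory_t, memory_mask=torch.rand(32, 20))  # mask: (T, S)
```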

bschaefl commented 4 years ago

Thanks for your interest in our work!

By default, the first dimension is the batch size, as seen in the Hopfield initializer https://github.com/ml-jku/hopfield-layers/blob/79162bee601f861befe17aea97dd1b0c12c6f465/modules/__init__.py#L41

and the corresponding docstring https://github.com/ml-jku/hopfield-layers/blob/79162bee601f861befe17aea97dd1b0c12c6f465/modules/__init__.py#L73

I assume the issue is solved after supplying the appropriate initializer argument.
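If the initializer argument referenced above is batch_first (as the linked lines suggest), a sketch of recovering the sequence-first (S, N, E) convention of nn.TransformerDecoderLayer might look like:

```python
import torch
from modules import Hopfield                          # assumed import path
from modules.transformer import HopfieldDecoderLayer  # assumed import path

# assumption: Hopfield exposes a batch_first flag defaulting to True (per the linked initializer)
hopfield_self = Hopfield(512, num_heads=8, batch_first=False)
hopfield_cross = Hopfield(512, num_heads=8, batch_first=False)
decoder_layer = HopfieldDecoderLayer(hopfield_self, hopfield_cross)

tgt = torch.rand(32, 10, 512)     # (T, N, E): target_length x batch_size x embedding
memory = torch.rand(20, 10, 512)  # (S, N, E): source_length x batch_size x embedding
out = decoder_layer(tgt, memory, memory_mask=torch.rand(32, 20))  # mask: (T, S)
```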