About weight sharing between embeddings

hyunwoongko / transformer

Transformer: PyTorch Implementation of "Attention Is All You Need"


Open TIPTOEHIGHERZ opened 1 month ago

TIPTOEHIGHERZ commented 1 month ago

The original paper says that the two embedding layers share weights, but I can't find any implementation of weight sharing in the code.
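
For reference, here is a minimal sketch of what such weight sharing could look like in PyTorch. It is not taken from this repository: the class name `TiedEmbeddings` and the attribute names `src_emb`, `tgt_emb`, and `generator` are hypothetical, and it assumes the source and target sides use one shared vocabulary (otherwise the two embedding matrices would not have the same shape). The paper also shares the same matrix with the pre-softmax linear layer, so the sketch ties that as well.

```python
import torch
import torch.nn as nn


class TiedEmbeddings(nn.Module):
    """Hypothetical illustration of weight tying; not part of this repository."""

    def __init__(self, vocab_size: int, d_model: int):
        super().__init__()
        # Assumption: source and target share a single vocabulary.
        self.src_emb = nn.Embedding(vocab_size, d_model)
        self.tgt_emb = nn.Embedding(vocab_size, d_model)
        # nn.Linear stores its weight as (out_features, in_features),
        # i.e. (vocab_size, d_model), the same shape as an embedding table.
        self.generator = nn.Linear(d_model, vocab_size, bias=False)

        # Point all three layers at the same Parameter object.
        self.tgt_emb.weight = self.src_emb.weight
        self.generator.weight = self.src_emb.weight


tied = TiedEmbeddings(vocab_size=100, d_model=16)
tokens = torch.tensor([[1, 2, 3]])
logits = tied.generator(tied.tgt_emb(tokens))  # shape: (1, 3, 100)
```

Because the three layers hold the same `Parameter`, gradients from all of them accumulate into one tensor, and `tied.parameters()` yields it only once.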
