lucidrains / x-transformers

A concise but complete full-attention transformer with a set of promising experimental features from various papers
MIT License
4.63k stars, 395 forks

encoder.to_logits.weight doesn't update #273

Closed: guillaumeguy closed this issue 2 weeks ago

guillaumeguy commented 2 weeks ago

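It seems that the parameter encoder.to_logits.weight in an Encoder/Decoder model is not part of the forward pass, so it never receives a gradient and never updates. This causes problems when training with DDP, which by default expects every parameter to contribute to the loss.

Should these weights be removed in that case?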
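
For readers who want to check for this kind of problem themselves, here is a minimal sketch in plain PyTorch. It is not the x-transformers code: ToyEncoder and unused_parameter_names are hypothetical names made up for illustration, and the toy module simply registers a to_logits layer that forward() never calls. After one backward pass, any parameter whose .grad is still None took no part in computing the loss.

```python
import torch
from torch import nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for illustration; not the x-transformers implementation."""

    def __init__(self, dim=8, num_tokens=16):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        # Registered as a submodule but never called in forward(),
        # mimicking an output head that is left unused.
        self.to_logits = nn.Linear(dim, num_tokens, bias=False)

    def forward(self, x):
        return self.proj(x)


def unused_parameter_names(model, loss):
    """Return the names of trainable parameters that got no gradient from `loss`."""
    model.zero_grad(set_to_none=True)  # clear stale grads so None means "untouched"
    loss.backward()
    return [
        name
        for name, param in model.named_parameters()
        if param.requires_grad and param.grad is None
    ]


model = ToyEncoder()
loss = model(torch.randn(2, 8)).sum()
unused = unused_parameter_names(model, loss)
print(unused)  # ['to_logits.weight']
```

If removing the unused weights is not an option, passing find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel is a common workaround, at the cost of some extra overhead per iteration.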

Reproducible example:

lucidrains commented 2 weeks ago

@guillaumeguy ah yes, that shouldn't be there

could you try 1.35.2?

guillaumeguy commented 2 weeks ago

Works great! Thank you for the quick turnaround!

lucidrains commented 2 weeks ago

@guillaumeguy you are one of the few using a full encoder decoder transformer

how are you using it?