Why doesn't the Transformer model produce the result in parallel?

ruotianluo / ImageCaptioning.pytorch

I decided to sync up this repo and self-critical.pytorch. (The old master is kept in the old master branch for archive.)

MIT License

1.44k stars 412 forks

Open kamille-hand opened 3 years ago

kamille-hand commented 3 years ago

Why are the features sent to the TransformerModel to get only the next word at each step, instead of getting the whole sentence in parallel like BERT?
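
For context, the contrast in the question can be illustrated with a toy sketch. It is not code from ImageCaptioning.pytorch: `toy_next_token`, `teacher_forced_predictions`, and `greedy_decode` are hypothetical names, and the "model" is a plain arithmetic stand-in. The only assumption is that the caption decoder is autoregressive, meaning each word is conditioned on the words before it. Under that assumption, all positions can be scored at once when the ground-truth caption is known (training), but at inference each word has to be generated before the next one can be computed.

```python
from typing import List

BOS, EOS = 0, 9  # hypothetical begin/end-of-sentence token ids


def toy_next_token(features: int, prefix: List[int]) -> int:
    """Stand-in for one decoder step: the output depends on the image
    features AND on every token in the prefix (autoregressive)."""
    return (features + sum(prefix) + len(prefix)) % 10


def teacher_forced_predictions(features: int, target: List[int]) -> List[int]:
    """Training-style pass: the ground-truth caption is known, so the
    prefix for every position exists up front. Each position is
    independent of the others and could be computed in parallel."""
    inputs = [BOS] + target[:-1]  # shift right by one position
    return [toy_next_token(features, inputs[: t + 1]) for t in range(len(target))]


def greedy_decode(features: int, max_len: int = 8) -> List[int]:
    """Inference-style pass: no ground truth, so token t can only be
    computed after token t-1 has been generated."""
    seq = [BOS]
    for _ in range(max_len):
        nxt = toy_next_token(features, seq)
        seq.append(nxt)
        if nxt == EOS:
            break
    return seq[1:]  # drop BOS


if __name__ == "__main__":
    caption = greedy_decode(features=3)
    print(caption)
    # Feeding the generated caption back as a known target reproduces it
    # in a single teacher-forced pass.
    print(teacher_forced_predictions(3, caption) == caption)
```

BERT, by contrast, is a bidirectional encoder that predicts masked positions from both left and right context, so it has no such left-to-right dependency between its outputs.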