marian-nmt / marian-dev

Fast Neural Machine Translation in C++ - development repository
https://marian-nmt.github.io

[Feature Request] Decoder-only Marian models #989

Open alvations opened 1 year ago

alvations commented 1 year ago

Feature description

Given the popularity of GPT / PaLM / BLOOM, it would be good to have Marian benchmarks for decoder-only models.

It's not that I think it will give a better chrF score (maybe it will), but when comparing marian-nmt models to models from other libraries, it is hard to explain "hyperparameter equivalence" and the quirks of each layer implementation.

Example

I haven't seen anyone coding one from scratch in C++, but OpenNMT's CTranslate2 went the "converter" route: https://github.com/OpenNMT/CTranslate2/blob/master/python/ctranslate2/converters/transformers.py