Feature description
Given the popularity of GPT / PaLM / BLOOM, it would be good to have Marian benchmarks for decoder-only models.
It is not that I expect better ChrF (though it might happen), but comparing marian-nmt models against models from other libraries makes it hard to reason about hyperparameter equivalence and quirks in layer implementations.
Example
I haven't seen anyone implement this from scratch in C++, but OpenNMT's CTranslate2 went the "converter" route: https://github.com/OpenNMT/CTranslate2/blob/master/python/ctranslate2/converters/transformers.py
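To illustrate what the "converter" route means in practice, here is a minimal sketch of the core idea: translating tensor names from one library's checkpoint naming scheme to another's. All key names below are hypothetical placeholders, not actual Hugging Face or Marian parameter names; a real converter (like the CTranslate2 one linked above) also has to handle weight transposition, fusion/splitting of QKV matrices, and config translation.

```python
import re

# Hypothetical static name mapping from a GPT-style checkpoint to a
# Marian-style checkpoint (illustrative only; real keys differ).
STATIC_MAP = {
    "transformer.wte.weight": "Wemb",
    "transformer.ln_f.weight": "decoder_ln_out_scale",
}

# Per-layer tensors carry an index that must be rewritten.
LAYER_PATTERN = re.compile(r"transformer\.h\.(\d+)\.attn\.c_attn\.weight")


def convert_name(src):
    """Translate one source tensor name to the target naming scheme.

    Returns None for unmapped tensors, which a real converter would
    either skip deliberately or raise an error for.
    """
    if src in STATIC_MAP:
        return STATIC_MAP[src]
    m = LAYER_PATTERN.fullmatch(src)
    if m:
        # Assumes the target scheme numbers layers from 1 (illustrative).
        layer = int(m.group(1)) + 1
        return f"decoder_l{layer}_self_Wqkv"
    return None
```

A full converter would iterate over every tensor in the source state dict, apply a mapping like this, and serialize the renamed (and possibly reshaped) tensors in the target format.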