Helsinki-NLP / OPUS-CAT

OPUS-CAT is a collection of software which makes it possible to use OPUS-MT neural machine translation models in professional translation. OPUS-CAT includes a local offline MT engine and a collection of CAT tool plugins.
MIT License

OPUS-CAT Trados plugin doesn't support inline tags #12

Closed mikethetexan closed 7 months ago

mikethetexan commented 3 years ago

image

See the comparison of the OPUS-CAT plugin vs. the DeepL or Google plugin. Not supporting inline tags is a big drawback for technical documentation (such as DITA-based content), which is typically full of them.

TommiNieminen commented 3 years ago

You're right, it's a drawback, and tag insertion is one of the priorities for future development. There is actually already a feature in the fine-tuning functionality which can be used to automatically insert tags in Studio. The feature is very simplistic, as it assumes that tags are in the same order in the source as they are in the target, but it's still useful with many documents.

When you fine-tune, there are these two checkboxes:

image

If you check those boxes, the fine-tuning functionality will include tags as text in the fine-tuning bitext, which means that the fine-tuned model will be trained to carry tags present in the source segment over into the target segment. I just tested that the feature works with TMX fine-tuning, and it seems to have an effect even when trained with just 2000 sentence pairs (though the more, the better):

image
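
As a rough illustration of the idea (not the actual OPUS-CAT code), the Python sketch below turns inline tags into plain-text placeholders so that they can appear in a bitext, and then puts the original tags back into a translation purely by order. The `{0}`-style placeholder format, the simplified tag regex, and the function names `mask_tags` and `restore_tags` are all assumptions made for this example.

```python
import re

# Simplified pattern for inline tags such as <b>, </b> or <ph id="1"/>.
# Assumption: real Studio tags are richer than this.
TAG = re.compile(r"</?[A-Za-z][^<>]*>")

# Hypothetical placeholder format used to represent a tag as plain text.
SLOT = re.compile(r"\{\d+\}")


def mask_tags(segment):
    """Replace each tag with a numbered placeholder, in order of appearance.

    Returns the masked text and the list of original tags.
    """
    tags = TAG.findall(segment)
    numbers = iter(range(len(tags)))
    masked = TAG.sub(lambda match: "{%d}" % next(numbers), segment)
    return masked, tags


def restore_tags(translation, source_tags):
    """Fill placeholders in the translation with the source tags, by order.

    The n-th placeholder found in the translation receives the n-th source
    tag, regardless of the number inside the placeholder. This is the
    "same order in source and target" assumption. Surplus placeholders
    are replaced with an empty string.
    """
    remaining = iter(source_tags)
    return SLOT.sub(lambda match: next(remaining, ""), translation)


# One masked sentence pair, as it could appear in a fine-tuning bitext.
masked_source, source_tags = mask_tags("Press <b>OK</b> to continue.")
masked_target, _ = mask_tags("Paina <b>OK</b> jatkaaksesi.")

# A model trained on such pairs is expected to emit the placeholders itself;
# here the masked target stands in for the model output.
restored = restore_tags(masked_target, source_tags)
```

Here `masked_source` is `Press {0}OK{1} to continue.` and `restored` is `Paina <b>OK</b> jatkaaksesi.`. Because tags are assigned strictly by position, a translation that reorders the tagged spans would get its tags in the wrong places, which is the limitation of the simple approach.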

I actually added this feature a long time ago for a specific translation job where there were a lot of tags (I do freelance translation when I can find the time), and it was very useful for that job. Once I have the time, I'll develop this into a more sophisticated tag insertion method.

TommiNieminen commented 7 months ago

Inline tag restoration has been added to the Trados plugin (a long time ago, actually).