sillsdev / silnlp

A set of pipelines for performing experiments on various NLP tasks with a focus on resource-poor/minority languages.

Fix for #540, issues (+ support for multiple translations) with exper… #544

Closed benjaminking closed 1 month ago

benjaminking commented 1 month ago

This is a fix for #540, the brackets issue that Matthew found. It was caused by a type mismatch between the output of Hugging Face's pipeline and the Python type hints when tokenized inputs are passed to the pipeline. (The types are correct when the pipeline is given untokenized inputs.) The mismatch only shows up when running python -m silnlp.nmt.experiment --test.

This commit fixes the issue by restructuring the output for tokenized inputs so that it is identical to the output for untokenized inputs. It also adds support for multiple translations when running python -m silnlp.nmt.experiment --test.
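
For illustration only, here is a rough sketch of what "restructuring the output to be identical in both cases" could look like. It is not the code from this commit. The helper name normalize_outputs and the exact input shapes are assumptions made for the example: the flat shape is one dict per sentence, and the nested shape is one list of dicts per sentence, possibly with extra single-element nesting. Only the "translation_text" key is taken from the Hugging Face translation pipeline output.

```python
from typing import Any, Dict, List, Union

Candidate = Dict[str, Any]
RawOutput = Union[Candidate, List[Any]]


def normalize_outputs(raw: List[RawOutput]) -> List[List[Candidate]]:
    """Return one list of candidate dicts per input sentence.

    Hypothetical helper: the accepted shapes are assumptions for this
    sketch, not a description of what the pipeline actually returns.
    """
    normalized: List[List[Candidate]] = []
    for item in raw:
        # Unwrap redundant single-element nesting, e.g. [[{...}]] -> [{...}].
        while isinstance(item, list) and len(item) == 1 and isinstance(item[0], list):
            item = item[0]
        if isinstance(item, dict):
            # A bare dict is a single candidate, so wrap it in a list.
            normalized.append([item])
        else:
            # Already a list of candidates (one or more translations).
            normalized.append(list(item))
    return normalized


# Both assumed shapes end up in the same structure.
flat = [{"translation_text": "a"}, {"translation_text": "b"}]
nested = [[{"translation_text": "a"}], [{"translation_text": "b"}]]
assert normalize_outputs(flat) == normalize_outputs(nested)
```

With a single shape like this, downstream code can always iterate over a list of candidates per sentence, whether one translation or several were requested.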


benjaminking commented 1 month ago

I've added a fix for the first issue.

For the second, it would be possible to leave the output format unchanged when the user doesn't pass --multiple-translations. I initially decided against that because I wasn't sure that having two different output formats would be a good choice either.