keeleinstituut / tv-tolkevarav

Tõlkevärav (Translation Hub)

Can't send some file formats to cat tool (BE) #692

Open plakitkelly opened 6 months ago

plakitkelly commented 6 months ago

322 was ok, but can't generate some files anymore

Allowed file formats: can't click on "Genereeri tõlkimiseks" (Generate for translation) -> Error: Project conversion failure

kadmit commented 5 months ago

@MariusJulius I investigated and found a possible reason why some formats are not supported.

MateCat depends on a component called MateCat Filters. This component is responsible for converting files from their original formats to .xliff, which MateCat can handle for translation. I found out that we are using the public version of MateCat Filters, which is outdated (https://github.com/matecat/MateCat-Filters), and I think that's the reason why some file formats are not supported in our system.

The new version of MateCat Filters, which supports more formats, is not available to the public, so we can't install and use it as we did with the old version. Instead, MateCat proposes using their paid API, which can be easily integrated into the existing solution: https://filters.matecat.com.

One possible solution is to add the MateCat Win Converter to support more file formats: https://github.com/matecat/MateCat-Win-Converter. However, it is also outdated, and it relies on a third-party paid API to convert files (https://cloudconvert.com/pdf-converter).

Link to MateCat installation guide related to the filters: https://site.matecat.com/installation-guide#filters
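
Until the converter question is settled, one option might be to reject unsupported formats before the file is sent to the CAT tool, so the user sees a clear message instead of "Project conversion failure". Below is a minimal sketch of such a pre-check. Everything in it is an assumption for illustration: the names `CONVERTIBLE_EXTENSIONS`, `UnsupportedFormatError` and `check_convertible` are hypothetical, and the extension list is a placeholder that would have to be replaced with whatever the installed Filters version actually converts.

```python
from pathlib import Path

# Placeholder allow-list (assumption): replace with the formats that the
# installed MateCat Filters version is known to convert.
CONVERTIBLE_EXTENSIONS = frozenset({".docx", ".xlsx", ".txt", ".html"})


class UnsupportedFormatError(ValueError):
    """Raised when a file's extension is not on the allow-list."""


def check_convertible(filename, convertible=CONVERTIBLE_EXTENSIONS):
    """Return the lower-cased extension, or raise if it is not convertible."""
    extension = Path(filename).suffix.lower()
    if extension not in convertible:
        raise UnsupportedFormatError(
            f"{filename!r}: format {extension or '(none)'} cannot be converted to XLIFF"
        )
    return extension
```

With the placeholder list above, `check_convertible("Report.DOCX")` passes, while `check_convertible("legacy.doc")` raises `UnsupportedFormatError` before any conversion is attempted.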

MariusJulius commented 5 months ago

@PlaksoBirgit @NeleKo we have an issue with the formats, would need to clarify.

What can be done?

  1. MateCat Win Converter (GitHub, old) to support doc etc. It doesn't support PDF without the paid cloud converter API (security?). DevOps work, no need to code. @thenouan ?
  2. Easiest for MVP: the paid MateCat Filters API (integration with the cloud converter is available by default). No setup time, just configuration.

MariusJulius commented 5 months ago

doc and xls are important. PDF and other formats are not relevant.

MariusJulius commented 4 months ago

Won't do and decision: