keeleinstituut / tv-tolkevarav

Tõlkevärav (Translation Hub)

Checking why xml is not converted to xliff (BE) #669

Open MariusJulius opened 9 months ago

MariusJulius commented 9 months ago

The analysis does not work because the source file is XML, and it is not converted to XLIFF either. The XML source file from Riigi Teataja is necessary: in the future, XML will be the type they use as the source file. I guess we need to find out what is wrong with the data structure in this XML.

vana-VÕS.xml.zip

NB! This is not always the case. The same file did not work directly in Matecat either.

plakitkelly commented 7 months ago
  1. I can't analyze the vana-VÕS.xml file.

{ "data": { "setup_status": "IN_PROGRESS", "analyzing_status": "NOT_STARTED", "cat_jobs": [] } }

  2. I tried my short xml file - works
  3. I also tried other xml from riigiteataja.ee - works
  4. I downloaded another VÕS from riigiteataja - doesn't work

It seems to me that the system can't handle long files. My 2nd file is short. The 3rd file has 1469 rows and it worked. But the 1st and 4th files (VÕS), which both failed, have more than 48 000 rows each.
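To separate "the file is too long" from "the file is malformed", each XML could be checked locally before upload. A minimal sketch, assuming only the Python standard library; the function name `describe_xml` is made up for illustration, and the 48 000 figure is just the row count observed above, not a known Matecat limit:

```python
import sys
import xml.etree.ElementTree as ET


def describe_xml(path):
    """Return (row_count, well_formed, error_message) for one XML file."""
    # Count rows the same way an editor would: one per newline-terminated line.
    with open(path, "rb") as fh:
        row_count = sum(1 for _ in fh)
    try:
        # iterparse streams the document, so a long file is not held in memory.
        for _event, elem in ET.iterparse(path):
            elem.clear()
        return row_count, True, None
    except ET.ParseError as exc:
        # The parser reports the line and column of the first structural error.
        return row_count, False, str(exc)


if __name__ == "__main__":
    # Usage (file names are whatever you pass in): python check_xml.py a.xml b.xml
    for name in sys.argv[1:]:
        rows, ok, err = describe_xml(name)
        print(f"{name}: {rows} rows, well-formed={ok}, error={err}")
```

If the long files parse cleanly here, the size hypothesis becomes more likely; if they fail, the reported line and column would point at the structural problem.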

I have had problems before with the liiklusseadus.txt file. Generation finished and I was able to open it in the translator, but I couldn't split the XLIFF: "Not possible to split the project that has analysis in progress".
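The stalled state from the response quoted earlier in this comment can be spotted programmatically. A minimal sketch: it uses only the field names and the two values seen in that response, the function name `is_stalled_in_setup` is hypothetical, and any other status value is simply treated as "not this pattern" because the full set of statuses is not known here:

```python
import json


def is_stalled_in_setup(response_text):
    """True if the response matches the pattern observed for the failing file."""
    data = json.loads(response_text).get("data", {})
    return (
        data.get("setup_status") == "IN_PROGRESS"
        and data.get("analyzing_status") == "NOT_STARTED"
        and not data.get("cat_jobs")  # empty list: no CAT jobs were created
    )


# The exact response quoted above for the failing file.
observed = (
    '{ "data": { "setup_status": "IN_PROGRESS", '
    '"analyzing_status": "NOT_STARTED", "cat_jobs": [] } }'
)
print(is_stalled_in_setup(observed))
```

Polling this over time would show whether the project ever leaves setup or stays in the same state indefinitely.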

kadmit commented 6 months ago

It may be related to https://github.com/keeleinstituut/tv-tolkevarav/issues/672. Let's recheck it after the changes are deployed.

MariusJulius commented 5 months ago

Testing blocked by: #744

MariusJulius commented 5 months ago

@NeleKo tested this vana-VÕS file: not working. The error also happens when adding the file directly on the Matecat website. Can we communicate that in some cases XML files don't work? The issue is not in our code but on the Matecat side. Tested äriseadustik: it worked.

MariusJulius commented 4 months ago

Can't support this specific XML, as Matecat does not support it either. Can't give any estimate for that.