giellalt / bugzilla-dummy

Unspecified translation direction in the meta file for nob2sme or sme2nob (Bugzilla Bug 1392) #59

Closed albbas closed 12 years ago

albbas commented 12 years ago

This issue was created automatically with bugzilla2github

Bugzilla Bug 1392

Date: 2012-07-19T22:04:06+02:00
From: Ciprian Gerstenberger <>
To: Berit Nystad Eskonsipo <>
CC: ciprian.gerstenberger, sjur.n.moshagen, trond.trosterud

Last updated: 2012-09-10T14:56:34+02:00

albbas commented 12 years ago

Comment 6552

Date: 2012-07-19 22:04:06 +0200 From: Ciprian Gerstenberger <>

There are

parallel_corpus_tmp>ls xxx2yyy/nob | wc -l
67

file pairs in the converted corpus that can be used to enrich the tmx corpus but that lack translation information, i.e., the information is in neither file of the pair.

For the whole list see biggies/forvaltningsordbok/second_run/xxx2yyy.tx

albbas commented 12 years ago

Comment 6559

Date: 2012-08-10 10:26:09 +0200 From: Ciprian Gerstenberger <>

Almost done by Børre, excellent!

only one pair is left now:

freecorpus>vi orig/nob/facta/skuvlahistorja1/ssh1-n.htm.xsl
freecorpus>vi orig/sme/facta/skuvlahistorja1/ssh1-s.htm.xsl

albbas commented 12 years ago

Comment 6655

Date: 2012-08-31 11:14:11 +0200 From: Børre Gaup <>

(In reply to comment #1)

> Almost done by Børre, excellent!
>
> only one pair is left now:
>
> freecorpus>vi orig/nob/facta/skuvlahistorja1/ssh1-n.htm.xsl
> freecorpus>vi orig/sme/facta/skuvlahistorja1/ssh1-s.htm.xsl

How do you find docs that lack translation information?

albbas commented 12 years ago

Comment 6656

Date: 2012-08-31 11:48:34 +0200 From: Ciprian Gerstenberger <>

As far as I remember, I used parallel_corpus_info.xsl from the .../scripts/corpus dir.

There is quite a lot of overhead in it, because at that time I was teaching myself a bit more XSLT. It is run as usual:

java -Xmx2048m net.sf.saxon.Transform -it main parallel_corpus_info.xsl inDir=converted

Just tell me if something is not OK or is unclear.
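
For readers without the XSLT script at hand, the check described in this report (translation information missing from both files of a pair) can be sketched in Python. This is not parallel_corpus_info.xsl and is only a minimal illustration: the element name `translated_from` with an `xml:lang` attribute is an assumption about the converted files, and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predeclared xml: prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def has_translation_info(xml_text, tag="translated_from"):
    """True if any <tag> element carries a non-empty xml:lang attribute.

    The default tag name is an assumption, not taken from the bug report.
    """
    root = ET.fromstring(xml_text)
    return any(elem.get(XML_LANG) for elem in root.iter(tag))


def pair_lacks_direction(nob_xml, sme_xml):
    """True if neither file of the pair states a translation direction."""
    return not has_translation_info(nob_xml) and not has_translation_info(sme_xml)


# Tiny made-up documents, for illustration only.
with_info = (
    '<document xml:lang="sme"><header>'
    '<translated_from xml:lang="nob"/>'
    "</header></document>"
)
without_info = '<document xml:lang="nob"><header/></document>'

print(pair_lacks_direction(without_info, without_info))  # True
print(pair_lacks_direction(without_info, with_info))     # False
```

A pair is flagged only when both files lack the information, which matches the "neither in one nor in the other file" condition in the original report.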

albbas commented 12 years ago

Comment 6663

Date: 2012-09-01 21:05:45 +0200 From: Ciprian Gerstenberger <>

I ran a local conversion of only orig/nob and orig/sme; the files listed below are the ones that still have an underspecified translation direction. (However, I saw that you moved a considerable number of files from the sme dir to the mixed dir, so I will rerun the whole conversion.)

parallel_corpus_tmp>ls xxx2yyy/*
xxx2yyy/nob:
22-juli-kommisjonens-rapport-.html_id=697509.xml
gratulerer-med-verdens-urfolksdag.html_id=697230.xml
ssh1-n.htm.xml
takket-mood-for-innsatsen-i-syria.html_id=697002.xml
tale-ved-aufs-arrangement-22-juli-2012.html_id=696942.xml
tale-ved-minnekonsert.html_id=696944.xml
tale-ved-samling-for-etterlatte-parorend.html_id=696943.xml

xxx2yyy/sme:
giittii-mood-syria-barggu-ovddas.html_id=697002.xml
sardni-aufs-lagideamis-utoyas.html_id=696942.xml
sardni-oamehasaide-ja-eaktodahtolaaide.html_id=696943.xml
sardni-raeviessoilju-muitokonsearttas.html_id=696944.xml
savvat-buori-algoalbmotbeaivvi.html_id=697230.xml
ssh1-s.htm.xml
suoidnemanu-22beaivve-kommiuvnna-raporta.html_id=697509.xml

albbas commented 12 years ago

Comment 6668

Date: 2012-09-03 01:05:16 +0200 From: Ciprian Gerstenberger <>

After the last conversion, it seems that there is one more file pair with underspecified translation direction:

xxx2yyy>ls *
nob:
22-juli-kommisjonens-rapport-.html_id=697509.xml
gratulerer-med-verdens-urfolksdag.html_id=697230.xml
speech-at-the-ceremony-in-the-government.html_id=696940.xml
ssh1-n.htm.xml
takket-mood-for-innsatsen-i-syria.html_id=697002.xml
tale-ved-aufs-arrangement-22-juli-2012.html_id=696942.xml
tale-ved-minnekonsert.html_id=696944.xml
tale-ved-samling-for-etterlatte-parorend.html_id=696943.xml

sme:
giittii-mood-syria-barggu-ovddas.html_id=697002.xml
sardni-aufs-lagideamis-utoyas.html_id=696942.xml
sardni-kransabidjamis.html_id=696940.xml
sardni-oamehasaide-ja-eaktodahtolaaide.html_id=696943.xml
sardni-raeviessoilju-muitokonsearttas.html_id=696944.xml
savvat-buori-algoalbmotbeaivvi.html_id=697230.xml
ssh1-s.htm.xml
suoidnemanu-22beaivve-kommiuvnna-raporta.html_id=697509.xml

xxx2yyy>ls nob | wc -l
8
xxx2yyy>ls sme | wc -l
8

albbas commented 12 years ago

Comment 6688

Date: 2012-09-05 09:51:34 +0200 From: Ciprian Gerstenberger <>

The latest tests wrt underspecified translation direction:

parallel_corpus_tmp>ls xxx2yyy/nob/ | wc -l
10
parallel_corpus_tmp>ls xxx2yyy/sme/ | wc -l
10

parallel_corpus_tmp>ls xxx2yyy/nob/*
xxx2yyy/nob/22-juli-kommisjonens-rapport-.html_id=697509.xml
xxx2yyy/nob/beredskapstiltak.html_id=698072.xml
xxx2yyy/nob/gratulerer-med-verdens-urfolksdag.html_id=697230.xml
xxx2yyy/nob/speech-at-the-ceremony-in-the-government.html_id=696940.xml
xxx2yyy/nob/ssh1-n.htm.xml
xxx2yyy/nob/takket-mood-for-innsatsen-i-syria.html_id=697002.xml
xxx2yyy/nob/tale-ved-aufs-arrangement-22-juli-2012.html_id=696942.xml
xxx2yyy/nob/tale-ved-minnekonsert.html_id=696944.xml
xxx2yyy/nob/tale-ved-samling-for-etterlatte-parorend.html_id=696943.xml
xxx2yyy/nob/utlysning---proveordning-med-tilskudd-ti.html_id=698048.xml

parallel_corpus_tmp>ls xxx2yyy/sme/*
xxx2yyy/sme/almmuhus--geahalanortnet-mas-dorjot-kult.html_id=698048.xml
xxx2yyy/sme/giittii-mood-syria-barggu-ovddas.html_id=697002.xml
xxx2yyy/sme/sardni-aufs-lagideamis-utoyas.html_id=696942.xml
xxx2yyy/sme/sardni-kransabidjamis.html_id=696940.xml
xxx2yyy/sme/sardni-oamehasaide-ja-eaktodahtolaaide.html_id=696943.xml
xxx2yyy/sme/sardni-raeviessoilju-muitokonsearttas.html_id=696944.xml
xxx2yyy/sme/savvat-buori-algoalbmotbeaivvi.html_id=697230.xml
xxx2yyy/sme/ssh1-s.htm.xml
xxx2yyy/sme/stahtaministtar-almmuha-oa-gearggusvuoad.html_id=698072.xml
xxx2yyy/sme/suoidnemanu-22beaivve-kommiuvnna-raporta.html_id=697509.xml

==> Now there are two more pairs than before!
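
The two new pairs can be picked out by comparing the `_id=` values in the nob listings from comment 6668 and from this comment. A minimal Python sketch, assuming that the numeric id in a file name identifies the pair (the helper name `ids_in_listing` is hypothetical; names without an `_id=` part, such as ssh1-n.htm.xml, are skipped):

```python
import re

# Matches the numeric id at the end of names like 'x.html_id=697002.xml'.
ID_RE = re.compile(r"_id=(\d+)\.xml$")


def ids_in_listing(names):
    """Return the set of ids found in the given file names."""
    ids = set()
    for name in names:
        match = ID_RE.search(name)
        if match:
            ids.add(match.group(1))
    return ids


# nob listing from comment 6668 (8 files)
earlier = [
    "22-juli-kommisjonens-rapport-.html_id=697509.xml",
    "gratulerer-med-verdens-urfolksdag.html_id=697230.xml",
    "speech-at-the-ceremony-in-the-government.html_id=696940.xml",
    "ssh1-n.htm.xml",
    "takket-mood-for-innsatsen-i-syria.html_id=697002.xml",
    "tale-ved-aufs-arrangement-22-juli-2012.html_id=696942.xml",
    "tale-ved-minnekonsert.html_id=696944.xml",
    "tale-ved-samling-for-etterlatte-parorend.html_id=696943.xml",
]

# nob listing from this comment (10 files): the earlier ones plus two more
latest = earlier + [
    "beredskapstiltak.html_id=698072.xml",
    "utlysning---proveordning-med-tilskudd-ti.html_id=698048.xml",
]

new_ids = sorted(ids_in_listing(latest) - ids_in_listing(earlier))
print(new_ids)  # ['698048', '698072']
```

The same two ids appear in the sme listing, so the new pairs are beredskapstiltak / stahtaministtar-almmuha-oa-gearggusvuoad (698072) and utlysning---proveordning-med-tilskudd-ti / almmuhus--geahalanortnet-mas-dorjot-kult (698048).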

albbas commented 12 years ago

Comment 6742

Date: 2012-09-10 14:54:00 +0200 From: Berit Nystad Eskonsipo <>

(In reply to comment #6) Translation direction is now correct for the files mentioned in comment #6. This bug will now be closed.