Date: 2012-07-19 22:04:06 +0200
From: Ciprian Gerstenberger
There are 67 file pairs in the converted corpus that can be used for enriching the TMX corpus but that lack translation information, i.e., the information is in neither file of the pair:
parallel_corpus_tmp>ls xxx2yyy/nob | wc -l
67
For the whole list see biggies/forvaltningsordbok/second_run/xxx2yyy.tx
Date: 2012-08-10 10:26:09 +0200
From: Ciprian Gerstenberger
Almost done by Børre, excellent!
Only one pair is left now:
freecorpus>vi orig/nob/facta/skuvlahistorja1/ssh1-n.htm.xsl
freecorpus>vi orig/sme/facta/skuvlahistorja1/ssh1-s.htm.xsl
Date: 2012-08-31 11:14:11 +0200
From: Børre Gaup
(In reply to comment #1)
How do you find docs that lack translation information?
Date: 2012-08-31 11:48:34 +0200
From: Ciprian Gerstenberger
(In reply to comment #2)
As far as I remember, I used parallel_corpus_info.xsl from the .../scripts/corpus dir.
The stylesheet has quite a lot of overhead, because at that time I was still learning XSLT. I ran it as usual:
java -Xmx2048m net.sf.saxon.Transform -it main parallel_corpus_info.xsl inDir=converted
Just tell me if something is not OK or unclear.
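For readers who do not want to read the XSLT, the check described in this thread (a pair is a problem when neither file carries translation information) can be sketched in Python. This is not a port of parallel_corpus_info.xsl. The element name translated_from and its xml:lang attribute are assumptions about the converted XML header, and both function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this expanded name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def translated_from(xml_text):
    """Return the source language recorded in a document, or None.

    Assumption: translation information is stored as
    <translated_from xml:lang="..."/> somewhere in the document.
    """
    root = ET.fromstring(xml_text)
    elem = root.find(".//translated_from")
    if elem is None:
        return None
    # Treat an empty attribute value the same as a missing one.
    return elem.get(XML_LANG) or None


def pair_is_underspecified(doc_a, doc_b):
    """True when neither file of the pair says what it was translated from."""
    return translated_from(doc_a) is None and translated_from(doc_b) is None
```

Calling pair_is_underspecified on the text of the two files of a pair would flag exactly the case reported here, where the information is in neither file.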
Date: 2012-09-01 21:05:45 +0200
From: Ciprian Gerstenberger
I started a local conversion of only orig/nob and orig/sme, and the listing below shows the files with underspecified translation direction. (However, I saw that you moved a considerable number of files from the sme dir to the mixed dir, so I will rerun the whole conversion.)
parallel_corpus_tmp>ls xxx2yyy/*
xxx2yyy/nob:
22-juli-kommisjonens-rapport-.html_id=697509.xml
gratulerer-med-verdens-urfolksdag.html_id=697230.xml
ssh1-n.htm.xml
takket-mood-for-innsatsen-i-syria.html_id=697002.xml
tale-ved-aufs-arrangement-22-juli-2012.html_id=696942.xml
tale-ved-minnekonsert.html_id=696944.xml
tale-ved-samling-for-etterlatte-parorend.html_id=696943.xml

xxx2yyy/sme:
giittii-mood-syria-barggu-ovddas.html_id=697002.xml
sardni-aufs-lagideamis-utoyas.html_id=696942.xml
sardni-oamehasaide-ja-eaktodahtolaaide.html_id=696943.xml
sardni-raeviessoilju-muitokonsearttas.html_id=696944.xml
savvat-buori-algoalbmotbeaivvi.html_id=697230.xml
ssh1-s.htm.xml
suoidnemanu-22beaivve-kommiuvnna-raporta.html_id=697509.xml
Date: 2012-09-03 01:05:16 +0200
From: Ciprian Gerstenberger
After the last conversion, it seems that there is one more file pair with underspecified translation direction:
xxx2yyy>ls *
nob:
22-juli-kommisjonens-rapport-.html_id=697509.xml
gratulerer-med-verdens-urfolksdag.html_id=697230.xml
speech-at-the-ceremony-in-the-government.html_id=696940.xml
ssh1-n.htm.xml
takket-mood-for-innsatsen-i-syria.html_id=697002.xml
tale-ved-aufs-arrangement-22-juli-2012.html_id=696942.xml
tale-ved-minnekonsert.html_id=696944.xml
tale-ved-samling-for-etterlatte-parorend.html_id=696943.xml

sme:
giittii-mood-syria-barggu-ovddas.html_id=697002.xml
sardni-aufs-lagideamis-utoyas.html_id=696942.xml
sardni-kransabidjamis.html_id=696940.xml
sardni-oamehasaide-ja-eaktodahtolaaide.html_id=696943.xml
sardni-raeviessoilju-muitokonsearttas.html_id=696944.xml
savvat-buori-algoalbmotbeaivvi.html_id=697230.xml
ssh1-s.htm.xml
suoidnemanu-22beaivve-kommiuvnna-raporta.html_id=697509.xml

xxx2yyy>ls nob | wc -l
8
xxx2yyy>ls sme | wc -l
8
Date: 2012-09-05 09:51:34 +0200
From: Ciprian Gerstenberger
The latest tests wrt underspecified translation direction:
parallel_corpus_tmp>ls xxx2yyy/nob/ | wc -l
10
parallel_corpus_tmp>ls xxx2yyy/sme/ | wc -l
10

parallel_corpus_tmp>ls xxx2yyy/nob/*
xxx2yyy/nob/22-juli-kommisjonens-rapport-.html_id=697509.xml
xxx2yyy/nob/beredskapstiltak.html_id=698072.xml
xxx2yyy/nob/gratulerer-med-verdens-urfolksdag.html_id=697230.xml
xxx2yyy/nob/speech-at-the-ceremony-in-the-government.html_id=696940.xml
xxx2yyy/nob/ssh1-n.htm.xml
xxx2yyy/nob/takket-mood-for-innsatsen-i-syria.html_id=697002.xml
xxx2yyy/nob/tale-ved-aufs-arrangement-22-juli-2012.html_id=696942.xml
xxx2yyy/nob/tale-ved-minnekonsert.html_id=696944.xml
xxx2yyy/nob/tale-ved-samling-for-etterlatte-parorend.html_id=696943.xml
xxx2yyy/nob/utlysning---proveordning-med-tilskudd-ti.html_id=698048.xml

parallel_corpus_tmp>ls xxx2yyy/sme/*
xxx2yyy/sme/almmuhus--geahalanortnet-mas-dorjot-kult.html_id=698048.xml
xxx2yyy/sme/giittii-mood-syria-barggu-ovddas.html_id=697002.xml
xxx2yyy/sme/sardni-aufs-lagideamis-utoyas.html_id=696942.xml
xxx2yyy/sme/sardni-kransabidjamis.html_id=696940.xml
xxx2yyy/sme/sardni-oamehasaide-ja-eaktodahtolaaide.html_id=696943.xml
xxx2yyy/sme/sardni-raeviessoilju-muitokonsearttas.html_id=696944.xml
xxx2yyy/sme/savvat-buori-algoalbmotbeaivvi.html_id=697230.xml
xxx2yyy/sme/ssh1-s.htm.xml
xxx2yyy/sme/stahtaministtar-almmuha-oa-gearggusvuoad.html_id=698072.xml
xxx2yyy/sme/suoidnemanu-22beaivve-kommiuvnna-raporta.html_id=697509.xml
==> Now, there are two pairs more than before!
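To see which pairs are new between two runs without comparing the listings by eye, one option is to key each file on the numeric id in its name and take the set difference. The following is a minimal sketch; the helper names pair_key and new_keys are hypothetical, and it assumes the _id=NNN.xml suffix identifies a pair across languages, as it appears to in the listings here. Files without an id (such as ssh1-n.htm.xml) fall back to their full name.

```python
import re

# Matches the numeric id at the end of names like "...html_id=697509.xml".
ID_PATTERN = re.compile(r"_id=(\d+)\.xml$")


def pair_key(filename):
    """Use the numeric id if present, otherwise the file name itself."""
    match = ID_PATTERN.search(filename)
    return match.group(1) if match else filename


def new_keys(previous, current):
    """Return the sorted keys present in `current` but not in `previous`."""
    before = {pair_key(name) for name in previous}
    after = {pair_key(name) for name in current}
    return sorted(after - before)


# nob listing from the 2012-09-03 comment
run_0903 = [
    "22-juli-kommisjonens-rapport-.html_id=697509.xml",
    "gratulerer-med-verdens-urfolksdag.html_id=697230.xml",
    "speech-at-the-ceremony-in-the-government.html_id=696940.xml",
    "ssh1-n.htm.xml",
    "takket-mood-for-innsatsen-i-syria.html_id=697002.xml",
    "tale-ved-aufs-arrangement-22-juli-2012.html_id=696942.xml",
    "tale-ved-minnekonsert.html_id=696944.xml",
    "tale-ved-samling-for-etterlatte-parorend.html_id=696943.xml",
]

# nob listing from the 2012-09-05 comment: the same files plus two more
run_0905 = run_0903 + [
    "beredskapstiltak.html_id=698072.xml",
    "utlysning---proveordning-med-tilskudd-ti.html_id=698048.xml",
]

print(new_keys(run_0903, run_0905))  # ['698048', '698072']
```

The same call on the sme listings should give the same two ids, since both files of a pair share the id.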
Date: 2012-09-10 14:54:00 +0200
From: Berit Nystad Eskonsipo
(In reply to comment #6) Translation direction is now correct for the files mentioned in comment #6. This bug will now be closed.
This issue was created automatically with bugzilla2github
Bugzilla Bug 1392
Date: 2012-07-19T22:04:06+02:00
From: Ciprian Gerstenberger
To: Berit Nystad Eskonsipo
CC: ciprian.gerstenberger, sjur.n.moshagen, trond.trosterud
Last updated: 2012-09-10T14:56:34+02:00