Helsinki-NLP / OpusTools

Using opus_read with -az, -sz, -tz options #26

Status: Closed (pluiez closed this issue 3 years ago)

pluiez commented 3 years ago

Hi, I used opus_get -s ar -t en -d TED2013 --list to list the relevant files and downloaded them manually. Then I ran opus_read -s ar -t en -af ar-en.xml.gz -sz ar.zip -tz en.zip -d output, but it reported:

There is no item named 'ar/ted2013.en-ar.xml.gz' in the archive 'ar.zip'
Continuing from next sentence file pair.

How can I extract the plain text from the downloaded files?

pluiez commented 3 years ago

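The correct command is opus_read -s ar -t en -af ar-en.xml.gz -sz ar.zip -tz en.zip -d CORPUS_NAME. In other words, -d must be given the corpus name (TED2013 in this case) rather than the "output" value I passed before.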
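
If opus_read reports that an item is missing from an archive, it may help to look at what the downloaded zip file actually contains. The following is a minimal sketch using only Python's standard zipfile module, not part of OpusTools. The helper names list_members and find_member are hypothetical, and the sketch makes no assumption about how the OPUS archives are laid out internally.

```python
import zipfile


def list_members(archive_path):
    """Return the names of all entries stored in a zip archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return archive.namelist()


def find_member(archive_path, wanted):
    """Return entries whose path ends with `wanted` (case-insensitive).

    Useful for checking whether a document named in an error message
    is present under a different directory prefix or letter case.
    """
    wanted = wanted.lower()
    return [
        name
        for name in list_members(archive_path)
        if name.lower().endswith(wanted)
    ]
```

For example, calling list_members("ar.zip") on the manually downloaded file shows the paths stored in it, which can then be compared with the path quoted in the error message.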