Closed by xiamengzhou 5 years ago
Hi. You can download the data from the shared task webpage http://www.statmt.org/wmt19/parallel-corpus-filtering.html
Thanks! But it looks like the Common Crawl monolingual data for sin and nep is not provided on the shared task webpage?
The Common Crawl data links have now been updated on the shared task webpage http://www.statmt.org/wmt19/parallel-corpus-filtering.html
Hi, I'm having trouble decompressing "commoncrawl.deduped.en.xz".
unxz: commoncrawl.deduped.en.xz: Unexpected end of input
I can decompress other files. Is there anything wrong with the file?
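An "Unexpected end of input" error from unxz usually means the archive is truncated, for example because the download was interrupted, rather than that the published file itself is bad. The following is a minimal Python sketch for checking that locally; the helper name xz_is_complete is made up for illustration, and it assumes the file has already been downloaded to the given path.

```python
import lzma


def xz_is_complete(path, chunk_size=1 << 20):
    """Return True if the .xz file at `path` decompresses cleanly to its end.

    Hypothetical helper for illustration. It streams the archive in chunks
    and discards the output, so it does not need room for the unpacked data.
    """
    try:
        with lzma.open(path, "rb") as f:
            # Read until EOF; a truncated archive raises EOFError,
            # a corrupted one raises lzma.LZMAError.
            while f.read(chunk_size):
                pass
        return True
    except (lzma.LZMAError, EOFError):
        return False


# Example usage (path taken from the message above):
# print(xz_is_complete("commoncrawl.deduped.en.xz"))
```

If this returns False, comparing the local file size with the size reported by the server and re-downloading (or resuming with something like `wget -c`) is a reasonable next step. `xz -t` performs a similar integrity test from the command line.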
Hi, I can't find any way to access the monolingual data used in the paper. Is there any way I can get it?