Helsinki-NLP / OpusFilter

OpusFilter - Parallel corpus processing toolkit
MIT License

Process Killed #46

Closed: bayesrule closed this issue 2 years ago

bayesrule commented 2 years ago

Hi,

  1. I've run the following step twice, and both times the process ended with "Killed". (On the second attempt the files had already been downloaded, so no download was needed.) Any idea why? The machine has 32 GB of RAM; is that not enough memory?
  2. Why does https://opus.nlpl.eu/ParaCrawl.php show v9 in the title when I can't get the v9 release? The latest version I can actually download is v8.

Thanks!

common:
  output_directory: CCMatrix_de-en

steps:

[two attached images, not reproduced here]

svirpioj commented 2 years ago

  1. This is possibly the same issue as https://github.com/Helsinki-NLP/OpusTools/issues/32. I tried running the step, and it does indeed use a lot of memory. (I killed the process at 15 GB, before it started swapping.)
  2. I cannot reproduce this; downloading ParaCrawl v9 works fine for me with both OpusFilter and OpusTools.

bayesrule commented 2 years ago

@svirpioj many thanks for answering! For 1, it looks like there is no other choice, so I've switched to ParaCrawl. For 2, v9 can be downloaded now, thanks!
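
For anyone hitting the same "Killed" message and wondering whether memory is the cause (point 1 above), here is a minimal sketch of one way to check. It is not part of OpusFilter: the helper name `run_and_report_peak_rss` is made up for illustration, and it assumes a POSIX system where Python's `resource` module is available. It runs a command, reports the peak resident memory of the child process, and flags whether the child was terminated by SIGKILL, which is what the Linux out-of-memory killer sends.

```python
import resource
import signal
import subprocess
import sys


def run_and_report_peak_rss(cmd):
    """Run cmd and return (returncode, peak_rss_mib, was_sigkilled).

    peak_rss_mib is the largest resident set size among the child
    processes this Python process has waited for so far.
    """
    proc = subprocess.run(cmd)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    # On POSIX, a negative return code -N means the child died from signal N.
    was_sigkilled = proc.returncode == -signal.SIGKILL
    return proc.returncode, peak / divisor, was_sigkilled
```

It could be called as `run_and_report_peak_rss(["opusfilter", "config.yaml"])`, where `config.yaml` is a placeholder for your own configuration file. A SIGKILL together with a peak close to the machine's RAM would be consistent with the process running out of memory, though it does not prove it; the kernel log is the more reliable place to confirm an out-of-memory kill.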