Closed: nottwy closed this issue 6 years ago.
This issue (https://github.com/marbl/canu/issues/838) was created by me and describes the same problem. You can find more information there.
The memory is probably not consumed by `rr_ctg_track.py` directly. That program spawns `LA4Falcon` for each .las file, so you will have a number of `LA4Falcon` instances running equal to your `--n-core` argument. (And each will be under a different sub-process, so the memory used by rr_ctg_track will be cloned. That's probably not a problem, but you can look at the forked python procs on your machine; see the sketch below.)
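If you want to see where the memory is actually going, something like the following should work. This is only a rough sketch, not part of FALCON-unzip; it assumes a Linux `/proc` filesystem and matches process names loosely.

```python
#!/usr/bin/env python3
"""Rough sketch (not part of FALCON-unzip): report resident memory of the
LA4Falcon instances and rr_ctg_track workers currently running.
Assumes a Linux /proc filesystem."""
import os

def rss_kb(pid):
    """Resident set size in kB for a pid, or 0 if unreadable."""
    try:
        with open('/proc/%s/status' % pid) as fh:
            for line in fh:
                if line.startswith('VmRSS:'):
                    return int(line.split()[1])
    except OSError:
        pass
    return 0

total_kb = 0
for pid in filter(str.isdigit, os.listdir('/proc')):
    try:
        with open('/proc/%s/cmdline' % pid) as fh:
            cmd = fh.read().replace('\0', ' ').strip()
    except OSError:
        continue  # process exited or is not readable
    if 'LA4Falcon' in cmd or 'rr_ctg_track' in cmd:
        kb = rss_kb(pid)
        total_kb += kb
        print('%10d kB  %s' % (kb, cmd))
print('total: %.1f GB' % (total_kb / 1024.0 / 1024.0))
```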
Each LA4Falcon loads the entire DAZZLER DB, which is probably your problem. (Look at the size of the file `0-rawreads/.raw_reads.bps`.) There are 2 solutions:

1. Put the DB into `/dev/shm`. (Non-trivial, but one user has done this; see the sketch after this list.)
2. Use `--n-core=0`. (Same as `--n-core=1`, but simpler, since it avoids the whole "multiprocessing" module.)

You can experiment with various values of `--n-core`.
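For the first option, the mechanics are roughly as below. This is only a sketch of the idea, not an official FALCON-unzip step; the directory layout (`0-rawreads/`) and DB file names (`raw_reads.*`, `.raw_reads.*`) are assumptions based on this thread, so adjust them to your run. The effect is that every `LA4Falcon` reads the same RAM-backed copy of the DB from tmpfs instead of going back to disk.

```python
#!/usr/bin/env python3
"""Sketch (not an official FALCON-unzip step): copy the DAZZLER DB files into
/dev/shm and leave symlinks behind, so LA4Falcon reads the DB from a
RAM-backed tmpfs. Paths and names are assumptions; adjust to your run."""
import os
import shutil

SRC_DIR = '0-rawreads'             # holds raw_reads.db, .raw_reads.bps, .raw_reads.idx, ...
SHM_DIR = '/dev/shm/raw_reads_db'  # RAM-backed destination (assumed name)

os.makedirs(SHM_DIR, exist_ok=True)
for name in os.listdir(SRC_DIR):
    if 'raw_reads' not in name or name.endswith('.las'):
        continue                   # only the DB files, not the .las files
    path = os.path.join(SRC_DIR, name)
    if os.path.islink(path) or not os.path.isfile(path):
        continue                   # already linked, or a directory
    target = os.path.join(SHM_DIR, name)
    shutil.copy2(path, target)     # copy into shared memory
    os.remove(path)                # replace the original with a symlink
    os.symlink(target, path)
    print('%s -> %s' % (path, target))
```

Note that tmpfs lives in RAM, so copying the DB into `/dev/shm` needs at least that much memory; check the size of `.raw_reads.bps` first.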
Also, your unzip might be out-of-date. You could try the Falcon-unzip binary tarball, as the GitHub code is not up-to-date.
The explanation is really clear and I believe the solution you provided will be useful. I'll try it as you said. Thank you.
Dear developer,
We found that the unzip module 'rr_ctg_track.py' tries to read all .las files into memory, and we have around 20 TB of .las files. It's hard to find a machine with that much memory. Do you have any suggestions for avoiding loading all the data into memory?
Thank you!