Open shenwei356 opened 3 years ago
Ah, I was assuming it was the parallel feature that was causing issues, but it is numpy trying to parallelize... something. Of note, this happens during sourmash index, not sourmash sketch.
Can you check whether setting these solves the problem?
export MKL_NUM_THREADS=1
export NUMEXPR_NUM_THREADS=1
export OMP_NUM_THREADS=1
(And even though the Rust parallel feature is not enabled, you can also set RAYON_NUM_THREADS=1 just to be safe.)
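For anyone who would rather not export these in the whole shell session, here is a minimal sketch of a wrapper that applies the same four variables to a single command only. The function name run_single_threaded is hypothetical and is not part of sourmash. It only sets the variables listed above, so it helps only if the extra threads come from a library that honors them.

```shell
# Hypothetical helper (not part of sourmash): run one command with the
# thread-count variables from this thread set to 1, without exporting
# them into the surrounding shell.
run_single_threaded() {
    env MKL_NUM_THREADS=1 \
        NUMEXPR_NUM_THREADS=1 \
        OMP_NUM_THREADS=1 \
        RAYON_NUM_THREADS=1 \
        "$@"
}

# Quick check that a child process sees the capped value.
run_single_threaded sh -c 'echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"'
```

Prefixing each sourmash sketch invocation with run_single_threaded should then keep every worker to one thread, so the parallelism comes only from the number of processes you launch yourself.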
Sorry, I did not make it clear: the error reported above is from running sourmash sketch, not index.
The environment variables work, but it's kind of tricky for ordinary users. Each process has a CPU usage of about 30%.
I agree, but I'm also not sure how to fix it on the Python side, since it is coming from one of the dependencies... There is probably some way to tell numpy to limit how many processes it uses, but I don't know it off the top of my head.
I'd like to compute and index MinHash sketches on GTDB r202 representative genomes.
The sketching step (v4.2.1) is parallelized with 16 or 40 threads on a 160-core machine. But some processes stopped unexpectedly with the errors below, while using 8 threads had no problem.
Note that the KeyboardInterrupt may not have been triggered by me, because it occurred very early.
I notice that each sourmash process has a CPU usage of up to 300-800%.
But @luizirber said:
Command
Where