epi2me-labs / pore-c-py

segfault with pore-c-py annotate #8

Open ahcm opened 9 months ago

ahcm commented 9 months ago

[04:46:22 - AnntateAln] Found 427679179 monomers in 28247142 concatemers.
Segmentation fault (core dumped)
srun: error: HOSTNAME: task 0: Exited with exit code 139

ahcm commented 9 months ago

Running again, this time with:

export NUMEXPR_MAX_THREADS=64
pore-c-py annotate --threads 64

This produces the same output (same size) and also segfaults. Another dataset segfaults as well.

ahcm commented 9 months ago

Same with current git version 5f40b7e. Both datasets always segfault after producing the same amount of output (138GB and 191GB).
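
For readers unfamiliar with the exit code 139 reported by srun above: on Linux, shells conventionally report 128 plus the signal number when a process is killed by a signal, and 139 - 128 = 11, which is SIGSEGV. A minimal sketch of that decoding follows; the helper name `describe_exit_code` is hypothetical and not part of pore-c-py or Slurm.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a shell-style exit code (hypothetical helper for illustration)."""
    # By shell convention, codes above 128 mean "killed by signal (code - 128)".
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"unknown signal {signum}"
        return f"terminated by {name}"
    return f"exited normally with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(139))
```

This only confirms that the process died from a segmentation fault; it says nothing about where in the program the fault occurred.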

cjw85 commented 9 months ago

Can you please provide the full pore-c-py annotate command that you are running? It would also be worthwhile running samtools quickcheck on the input you are giving to the program.

I'm unsure why you are setting NUMEXPR_MAX_THREADS. The numexpr Python package is not a direct dependency of pore-c-py (nor does it look like a transitive dependency after my quick check). Note also that the --threads parameter is used only to control the BAM compression threads used by htslib (through pysam). Setting the value to more than a handful is unlikely to result in linear performance gains.
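
As a sketch of the suggested check: samtools quickcheck prints nothing when a file passes and signals problems through a non-zero exit status, so it helps to look at the status explicitly. The wrapper below is an illustration only; the function names are hypothetical, and it assumes samtools is on the PATH (it returns None otherwise).

```python
import shutil
import subprocess


def quickcheck_command(paths):
    """Build the samtools quickcheck command line (-v names any failing files)."""
    return ["samtools", "quickcheck", "-v", *paths]


def run_quickcheck(paths):
    """Run samtools quickcheck and return its exit status.

    Returns None if samtools is not installed. A status of 0 means every
    file passed; any non-zero status means at least one file failed.
    """
    if shutil.which("samtools") is None:
        return None
    result = subprocess.run(
        quickcheck_command(paths), capture_output=True, text=True
    )
    # With -v, the names of failing files are written to stdout.
    if result.stdout:
        print(result.stdout, end="")
    return result.returncode
```

For example, `run_quickcheck(["input.bam"])` (file name hypothetical) returning 0 corresponds to quickcheck "showing nothing" on the command line.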

ahcm commented 9 months ago

samtools quickcheck shows nothing.

Without NUMEXPR_MAX_THREADS set, it complains that the variable is not set (and sets it to a default).

srun -c 64 pore-c-py annotate "${INPUT}-${ENZYME}.bam.fastq-minimap-digest-ref.sam" "${OUTPUT}" --monomers --stdout --summary --chromunity
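
One way to investigate the NUMEXPR_MAX_THREADS message is to check whether numexpr is installed in the environment and which installed packages declare it as a requirement. The following is a rough sketch using only the standard library; the function names are hypothetical, and it inspects declared metadata only, so it will not catch a package that imports numexpr without listing it as a requirement.

```python
import importlib.util
import re
from importlib import metadata


def normalize(name: str) -> str:
    """Normalize a distribution name for comparison (lowercase, dashes)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def is_installed(module_name: str) -> bool:
    """Return True if the module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


def dependents_of(package: str) -> list:
    """List installed distributions whose metadata requires `package`."""
    target = normalize(package)
    found = set()
    for dist in metadata.distributions():
        dist_name = dist.metadata["Name"]
        if not dist_name:
            continue
        for requirement in dist.requires or []:
            # The requirement string starts with the required project's name.
            match = re.match(r"[A-Za-z0-9._-]+", requirement)
            if match and normalize(match.group(0)) == target:
                found.add(dist_name)
    return sorted(found)


if __name__ == "__main__":
    print("numexpr installed:", is_installed("numexpr"))
    print("required by:", dependents_of("numexpr"))
```

Run inside the same environment as pore-c-py, this would show whether some other installed package pulls in numexpr.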

ahcm commented 9 months ago

Do I even need to run annotate for yahs?