sebhtml / ray

Ray -- Parallel genome assemblies for parallel DNA sequencing
http://denovoassembler.sf.net

seg fault during scaffolding in 2.2.0 #191

Closed sebhtml closed 10 years ago

sebhtml commented 10 years ago
[r103-n82:09563] *** Process received signal ***
[r103-n82:09563] Signal: Segmentation fault (11)
[r103-n82:09563] Signal code: Address not mapped (1)
[r103-n82:09563] Failing at address: 0x7f2615d099c8
[r103-n82:09563] [ 0] /lib64/libpthread.so.0 [0x7f28bb9b8be0]
[r103-n82:09563] [ 1] Ray(_ZN12ArrayOfReads2atEm+0x1a) [0x5490da]
[r103-n82:09563] [ 2] Ray(_ZN16MessageProcessor33call_RAY_MPI_TAG_GET_READ_MARKERSEP7Message+0x78) [0x4c1098]
[r103-n82:09563] [ 3] Ray(_ZN11ComputeCore10runVanillaEv+0xf4) [0x581054]
[r103-n82:09563] [ 4] Ray(_ZN11ComputeCore3runEv+0x80) [0x585640]
[r103-n82:09563] [ 5] Ray(_ZN7Machine5startEv+0x134a) [0x477d8a]
[r103-n82:09563] [ 6] Ray(main+0x27b) [0x475c1b]
[r103-n82:09563] [ 7] /lib64/libc.so.6(__libc_start_main+0xf4) [0x7f28bb670994]
[r103-n82:09563] [ 8] Ray(_ZNSt8ios_base4InitD1Ev+0x51) [0x471b99]
[r103-n82:09563] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 31 with PID 9563 on node r103-n82 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
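The mangled names in the trace are easier to read once demangled: frame 1 is `ArrayOfReads::at(unsigned long)` and frame 2 is `MessageProcessor::call_RAY_MPI_TAG_GET_READ_MARKERS(Message*)`. A minimal sketch for pulling the Ray frames out of an Open MPI backtrace like this one follows. The helper names `ray_symbols` and `demangle` are made up for illustration, and the script assumes binutils' `c++filt` may or may not be installed (it falls back to the raw symbols if not).

```python
import re
import shutil
import subprocess

# A few frames copied from the backtrace above.
TRACE = """\
[r103-n82:09563] [ 0] /lib64/libpthread.so.0 [0x7f28bb9b8be0]
[r103-n82:09563] [ 1] Ray(_ZN12ArrayOfReads2atEm+0x1a) [0x5490da]
[r103-n82:09563] [ 2] Ray(_ZN16MessageProcessor33call_RAY_MPI_TAG_GET_READ_MARKERSEP7Message+0x78) [0x4c1098]
[r103-n82:09563] [ 3] Ray(_ZN11ComputeCore10runVanillaEv+0xf4) [0x581054]
"""

# Matches "[ 1] Ray(<symbol>+0x<offset>)"; frames from other libraries are skipped.
FRAME = re.compile(r"\[\s*\d+\]\s+Ray\((?P<symbol>\w+)\+0x[0-9a-f]+\)")


def ray_symbols(trace):
    """Return the (still mangled) symbol of every Ray frame, in stack order."""
    return [match.group("symbol") for match in FRAME.finditer(trace)]


def demangle(symbols):
    """Demangle with c++filt when it is available, else return the input unchanged."""
    tool = shutil.which("c++filt")
    if tool is None or not symbols:
        return list(symbols)
    result = subprocess.run([tool, *symbols], capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


for name in demangle(ray_symbols(TRACE)):
    print(name)
```

Read this way, the crash is a read lookup by index (`ArrayOfReads::at`) made while answering a `RAY_MPI_TAG_GET_READ_MARKERS` message.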
sebhtml commented 10 years ago

script:

$ cat Legionella.sh 
#PBS -S /bin/bash
#PBS -N Legionella-2
#PBS -o Legionella-2.stdout
#PBS -e Legionella-2.stderr
#PBS -A nne-790-ac
#PBS -l walltime=00:08:00:00
#PBS -l nodes=4:ppn=8
cd $PBS_O_WORKDIR

module load apps/ray/2.2.0-2-MAXKMERLENGTH=32

mpiexec -n 32 \
Ray \
-k 31 \
-o Legionella-2 \
-p \
Legionella/MP1/Legionella-pneumophila_CCGTCC_L001_R1_001.fastq.gz \
Legionella/MP1/Legionella-pneumophila_CCGTCC_L001_R2_001.fastq.gz \
-p \
Legionella/MP2/Legionella-pneumophila_CCGTCC_L001_R1_001.fastq.gz \
Legionella/MP2/Legionella-pneumophila_CCGTCC_L001_R2_001.fastq.gz \
-p \
Legionella/PE/run3_ID120371_Lane1_R1_1.fastq \
Legionella/PE/run3_ID120371_Lane1_R2_1.fastq

Check whether this is reproducible and whether it is already fixed in master.

sebhtml commented 10 years ago

Also try a 2.2.0 build compiled with ASSERT=y.

sebhtml commented 10 years ago

To reproduce:

On colosse.calculquebec.ca, in /rap/nne-790-ab/projects/mate-test-2:

mpiexec -n 32 Ray \
 -k 31 \
 -o Legionella-4 \
 -p \
 Legionella/MP1/Legionella-pneumophila_CCGTCC_L001_R1_001.fastq.gz \
 Legionella/MP1/Legionella-pneumophila_CCGTCC_L001_R2_001.fastq.gz \
 -p \
 Legionella/MP2/Legionella-pneumophila_CCGTCC_L001_R1_001.fastq.gz \
 Legionella/MP2/Legionella-pneumophila_CCGTCC_L001_R2_001.fastq.gz \
 -p \
 Legionella/PE/run3_ID120371_Lane1_R1_1.fastq \
 Legionella/PE/run3_ID120371_Lane1_R2_1.fastq
sebhtml commented 10 years ago

The numbers of sequences in the R1 and R2 files are not the same for MP1 (342873 vs. 352365)!

{{{

$ cat Legionella-4/FilePartition.txt
File Name FirstSequence LastSequence NumberOfSequences
0 Legionella/MP1/Legionella-pneumophila_CCGTCC_L001_R1_001.fastq.gz 0 342872 342873
1 Legionella/MP1/Legionella-pneumophila_CCGTCC_L001_R2_001.fastq.gz 342873 695237 352365
2 Legionella/MP2/Legionella-pneumophila_CCGTCC_L001_R1_001.fastq.gz 695238 888221 192984
3 Legionella/MP2/Legionella-pneumophila_CCGTCC_L001_R2_001.fastq.gz 888222 1081205 192984
4 Legionella/PE/run3_ID120371_Lane1_R1_1.fastq 1081206 1675167 593962
5 Legionella/PE/run3_ID120371_Lane1_R2_1.fastq 1675168 2269129 593962

}}}
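This kind of mismatch can be caught before launching a run. The sketch below parses the FilePartition.txt rows shown above and reports any pair whose two files differ in sequence count. It assumes every file was given with `-p`, so that rows (0,1), (2,3), (4,5) are mates; that holds for this command line but not in general. The function names `parse_partition` and `mismatched_pairs` are hypothetical helpers, not part of Ray.

```python
# Rows copied from Legionella-4/FilePartition.txt above.
ROWS = """\
File Name FirstSequence LastSequence NumberOfSequences
0 Legionella/MP1/Legionella-pneumophila_CCGTCC_L001_R1_001.fastq.gz 0 342872 342873
1 Legionella/MP1/Legionella-pneumophila_CCGTCC_L001_R2_001.fastq.gz 342873 695237 352365
2 Legionella/MP2/Legionella-pneumophila_CCGTCC_L001_R1_001.fastq.gz 695238 888221 192984
3 Legionella/MP2/Legionella-pneumophila_CCGTCC_L001_R2_001.fastq.gz 888222 1081205 192984
4 Legionella/PE/run3_ID120371_Lane1_R1_1.fastq 1081206 1675167 593962
5 Legionella/PE/run3_ID120371_Lane1_R2_1.fastq 1675168 2269129 593962
"""


def parse_partition(text):
    """Return (index, name, first, last, count) tuples; header and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        index, name, first, last, count = fields
        rows.append((int(index), name, int(first), int(last), int(count)))
    return rows


def mismatched_pairs(rows):
    """Compare consecutive rows as mates (assumption: all files were passed with -p)."""
    bad = []
    for left, right in zip(rows[0::2], rows[1::2]):
        if left[4] != right[4]:
            bad.append((left[1], left[4], right[1], right[4]))
    return bad


for left_name, left_count, right_name, right_count in mismatched_pairs(parse_partition(ROWS)):
    print(f"{left_name} has {left_count} sequences, {right_name} has {right_count}")
```

Only the MP1 pair is reported for this data, which is presumably why the input, rather than Ray, was judged to be at fault.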

Resolution: INVALID