ogotoh / spaln

Genome mapping and spliced alignment of cDNA or amino acid sequences
GNU General Public License v2.0

Spaln error "invalid next size" #32

Closed marchoeppner closed 3 years ago

marchoeppner commented 3 years ago

Hi,

I have been struggling to use Spaln in a "production" setting because of various crashes. One of them is shown below:

Spaln version: 2.4.0 (191114, Bioconda)

Genome sequence: Homo sapiens GRCh38, unmasked (ftp://ftp.ensembl.org/pub/release-100/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz)

The crash-causing protein:

Q9HBE5.1 1 MPRGWAAPLLLLLLQGGWGCPDLVCYTDYLQTVICILEMWNLHPSTLTLTWQDQYEELKDEATSCSLHRS AHNATHATYTCHMDVFHFMADDIFSVNITDQSGNYSQECGSFLLAESIKPAPPFNVTVTFSGQYNISWRS DYEDPAFYMLKGKLQYELQYRNRGDPWAVSPRRKLISVDSRSVSLLPLEFRKDSSYELQVRAGPMPGSSY QGTWSEWSDPVIFQTQSEELKEGWNPHLLLLLLLVIVFIPAFWSLKTHPLWRLWKKIWAVPSPERFFMPL YKGCSGDFKKWVGAPFTGSSLELGPWSPEVPSTLEVYSCHPPRSPAKRLQLTELQEPAELVESDGVPKPS FWPTAQNSGGSAYSEERDRPYGLVSIDTVTVLDAEGPCTWPCSCEDDGYPALDLDAGLEPSPGLEDPLLD AGTTVLSCGCVSAGSPGLGGPLGSLLDRLKPPLADGEDWAGGLPWGGRSPGGVSESEAGSPLAGLDMDTF DSGFVGSDCSSPVECDFTSPGDEGPPRSYLRQWVVIPPPLSSPGPQAS

Command used to index genome: spaln -W -KP -E -t9 genome_spaln.fa

Command used to align: spaln -o bla -Q7 -TTetrapod -O12 -dgenome_spaln Q9HBE5.1.fa

Crash message (truncated):

Error in `spaln': free(): invalid next size (normal): 0x00002b650015a350
======= Backtrace: =========
/lib/libc.so.6(+0x750fb)[0x2b62827a10fb]
/lib/libc.so.6(+0x7ac36)[0x2b62827a6c36]
/lib/libc.so.6(+0x7b9b3)[0x2b62827a79b3]
spaln(+0x31e1e)[0x55c643756e1e]
spaln(+0x398d8)[0x55c64375e8d8]
spaln(+0x39ade)[0x55c64375eade]
spaln(+0x39db3)[0x55c64375edb3]
spaln(+0x39e8d)[0x55c64375ee8d]
spaln(+0x39db3)[0x55c64375edb3]
spaln(+0x3bcd6)[0x55c643760cd6]
spaln(+0x3b473)[0x55c643760473]
spaln(+0x3bd14)[0x55c643760d14]
spaln(+0x3b473)[0x55c643760473]
spaln(+0x3bd14)[0x55c643760d14]
spaln(+0x3b617)[0x55c643760617]
spaln(+0x3be34)[0x55c643760e34]
spaln(+0x3c244)[0x55c643761244]
spaln(+0x736f)[0x55c64372c36f]
spaln(+0x8149)[0x55c64372d149]
spaln(+0x8c25)[0x55c64372dc25]
spaln(+0x9465)[0x55c64372e465]
spaln(+0x9b94)[0x55c64372eb94]
/lib/libpthread.so.0(+0x81da)[0x2b62822131da]
/lib/libc.so.6(clone+0x6d)[0x2b62828158ed]
======= Memory map: ========
2b6281fe9000-2b6282009000 r-xp 00000000 07:00 143 /lib/ld-2.18.so
2b6282009000-2b628200b000 rw-p 00000000 00:00 0
2b628200b000-2b628200c000 rw-p 00000000 00:00 0
2b628200c000-2b628200f000 r--p 00000000 07:00 2319 /usr/local/lib/libz.so.1.2.11
2b628200f000-2b628201d000 r-xp 00003000 07:00 2319 /usr/local/lib/libz.so.1.2.11
2b628201d000-2b6282023000 r--p 00011000 07:00 2319 /usr/local/lib/libz.so.1.2.11
2b6282023000-2b6282024000 ---p 00017000 07:00 2319 /usr/local/lib/libz.so.1.2.11
2b6282024000-2b6282025000 r--p 00017000 07:00 2319 /usr/local/lib/libz.so.1.2.11
2b6282025000-2b6282026000 rw-p 00018000 07:00 2319 /usr/local/lib/libz.so.1.2.11
2b6282026000-2b62820c8000 r--p 00000000 07:00 2309 /usr/local/lib/libstdc++.so.6.0.26
2b62820c8000-2b6282147000 r-xp 000a2000 07:00 2309 /usr/local/lib/libstdc++.so.6.0.26
2b6282147000-2b6282188000 r--p 00121000 07:00 2309 /usr/local/lib/libstdc++.so.6.0.26
2b6282188000-2b6282193000 r--p 00161000 07:00 2309 /usr/local/lib/libstdc++.so.6.0.26
2b6282193000-2b6282197000 rw-p 0016c000 07:00 2309 /usr/local/lib/libstdc++.so.6.0.26
2b6282197000-2b628219b000 rw-p 00000000 00:00 0
2b628219b000-2b628219e000 r--p 00000000 07:00 2294 /usr/local/lib/libgcc_s.so.1
2b628219e000-2b62821aa000 r-xp 00003000 07:00 2294 /usr/local/lib/libgcc_s.so.1
2b62821aa000-2b62821ad000 r--p 0000f000 07:00 2294 /usr/local/lib/libgcc_s.so.1
2b62821ad000-2b62821ae000 r--p 00011000 07:00 2294 /usr/local/lib/libgcc_s.so.1
2b62821ae000-2b62821af000 rw-p 00012000 07:00 2294 /usr/local/lib/libgcc_s.so.1
2b62821af000-2b62821b4000 rw-p 00000000 00:00 0
2b6282208000-2b6282209000 r--p 0001f000 07:00 143 /lib/ld-2.18.so
2b6282209000-2b628220a000 rw-p 00020000 07:00 143 /lib/ld-2.18.so
2b628220a000-2b628220b000 rw-p 00000000 00:00 0
2b628220b000-2b6282224000 r-xp 00000000 07:00 163 /lib/libpthread-2.18.so
2b6282224000-2b6282423000 ---p 00019000 07:00 163 /lib/libpthread-2.18.so

marchoeppner commented 3 years ago

I should add that this goes away when I change "-Q7" to "-Q6". But since I want to build a production-capable pipeline, I cannot arbitrarily change the value of -Q. I do not think I fully understand the actual difference between -Q5, -Q6, and -Q7, or in which case each value is appropriate. Maybe this could be added to the usage information.

ogotoh commented 3 years ago

Dear marchoeppner,

I downloaded the human genome from the FTP site you indicated, formatted it (but without the -E option), and then ran spaln with the query sequence you provided. However, I could not reproduce the error you reported on my system (WSL Ubuntu1~18.04). The query exactly matches the genomic sequence, the distributions of exon and intron lengths are normal, and there is no paralog that could perturb the orthologous alignment. When spaln runs normally, the results for this query should not depend on whether -Q5, -Q6, or -Q7 is used, because the recursive HSP search is not conducted (the argument mod 4 specifies the maximal depth of recursion). I suggest that you compile spaln from the source on this site on your own system and try again. The current version is 2.4.1. To obtain the previous version (2.4.0), please check out Ver.2.4.04.
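As a small illustration of the "argument mod 4" remark, the following Python sketch computes the maximal recursion depth implied by each of the -Q values discussed in this thread. It only restates the arithmetic from the sentence above; it is not taken from the spaln source, and the function name recursion_depth is hypothetical.

```python
def recursion_depth(q: int) -> int:
    """Maximal depth of recursive HSP search implied by -Q<q>.

    Assumption: this simply applies the "argument mod 4" rule quoted
    in the reply; it does not reproduce spaln's internal logic.
    """
    return q % 4


# The three -Q values mentioned in this thread.
for q in (5, 6, 7):
    print(f"-Q{q}: maximal recursion depth = {recursion_depth(q)}")
```

Under this rule, -Q5, -Q6, and -Q7 differ only in how deep the recursive search is allowed to go (1, 2, and 3 respectively), which is why the output should be the same whenever no recursion is triggered.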

I do not recommend using the -E option, as it is still under development.

Osamu