baoe / AlignGraph

Algorithm for secondary de novo genome assembly guided by closely related references

Segfault #1

Closed dbrami closed 10 years ago

dbrami commented 10 years ago

I'm getting a segfault on 1 TB RAM machine comparing two small bacterial genomes.

Here's the stdout:

    cmd-> cat alignGraph.out
    AlignGraph: algorithm for secondary de novo genome assembly guided by closely related references
    By Ergude Bao, CS Department, UC-Riverside. All Rights Reserved

    (0) Alignment finished

    CHROMOSOME 0: (1) chromosome loaded (2) contig alignment loaded

Here's the error:

    [2]+ Segmentation fault /sgi/asmopt/src/AlignGraph/AlignGraph/AlignGraph --read1 all_R1.fasta --read2 all_R2.fasta --contig contigs.fasta --genome chromosome.fasta --distanceLow 1000 --distanceHigh 1000 --extendedContig extendedContigs.fa --remainingContig remainingContigs.fa > alignGraph.out 2> alignGraph.err

Also, you should not hardcode the number of processors for bowtie2 to 8. We have 64; the program should pick the maximum at runtime.
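
One way the thread count could be chosen at runtime is sketched below. This is only an illustration, not AlignGraph's actual code: `alignerThreads` and `withThreads` are hypothetical helper names, and the sketch assumes the aligner command is assembled as a string before being launched.

```cpp
#include <string>
#include <thread>

// Hypothetical helper: number of threads to hand to the aligner.
// std::thread::hardware_concurrency() may return 0 when the count
// cannot be determined, so fall back to a caller-supplied default.
unsigned alignerThreads(unsigned fallback = 8) {
    unsigned n = std::thread::hardware_concurrency();
    return n > 0 ? n : fallback;
}

// Hypothetical helper: append bowtie2's -p (threads) option to a command.
std::string withThreads(const std::string& command, unsigned threads) {
    return command + " -p " + std::to_string(threads);
}
```

On a 64-core machine `alignerThreads()` would normally report 64; the fallback of 8 only applies when the standard library cannot detect the core count.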

baoe commented 10 years ago

Thanks for the bug report.

I'm afraid it would be very hard to fix the problem without the specific data. It would be best if you could send me the fraction of the reads (e.g. 1K) that causes the problem; otherwise, you could tell me more about the reads (e.g. length and number), although this information may not help much.

In addition, you could try changing the distanceLow and distanceHigh options from 1000 to "insert length minus 1000" and "insert length plus 1000", as shown in the manual. I'm not sure if this is the cause of the problem, since on my various test data the 1000 setting only causes bad performance, not a segmentation fault.

dbrami commented 10 years ago

Hi Bao, I think you should provide a couple of command-line examples, as I struggled to get it right. The instructions sound like the command should include signed integers, which is problematic. I have a mean insert size of 590 with an SD of 200, so here are the params I used: '--distanceLow -410 --distanceHigh 1590'; this did not work because of the negative sign, so I changed it to '--distanceLow 100 --distanceHigh 1590'. The segmentation fault occurs after bowtie2 has completed mapping and blat has completed its alignment, while AlignGraph is processing (as confirmed by '(2) contig alignment loaded' in the log).
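
The arithmetic behind those parameters can be written down as a small sketch. `distanceBounds` is a hypothetical helper, not part of AlignGraph, and clamping the lower bound at zero is an assumption made here to avoid negative arguments; the thread does not say which minimum AlignGraph accepts (the run above used 100).

```cpp
#include <algorithm>
#include <utility>

// Hypothetical helper: derive (distanceLow, distanceHigh) as
// "insert length minus margin" and "insert length plus margin".
// The lower bound is clamped at 0 (an assumption) so that a short
// insert never yields a negative command-line value.
std::pair<long, long> distanceBounds(long insertLength, long margin = 1000) {
    long low = std::max(0L, insertLength - margin);
    long high = insertLength + margin;
    return {low, high};
}
```

For an insert length of 590 this gives a low bound of 0 instead of -410 and a high bound of 1590.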

There's not much I could send to you as the files in tmp dir are large:

    total 15G
       0 -rw-rw-r-- 1 dbrami employees    0 Jun 25 10:07 _chaff.fa
    4.0M -rw-rw-r-- 1 dbrami employees 4.0M Jun 25 10:07 _contigs.fa
    8.0K -rw-rw-r-- 1 dbrami employees 5.9K Jun 25 10:12 _contigs_genome.0.psl
    4.9M -rw-rw-r-- 1 dbrami employees 4.8M Jun 25 10:07 _genome.0.fa
    5.6M -rw-rw-r-- 1 dbrami employees 5.6M Jun 25 10:07 _genome.1.bt2
    1.2M -rw-rw-r-- 1 dbrami employees 1.2M Jun 25 10:07 _genome.2.bt2
       0 -rw-rw-r-- 1 dbrami employees   17 Jun 25 10:07 _genome.3.bt2
    1.2M -rw-rw-r-- 1 dbrami employees 1.2M Jun 25 10:07 _genome.4.bt2
    4.9M -rw-rw-r-- 1 dbrami employees 4.8M Jun 25 10:07 _genome.fa
    5.6M -rw-rw-r-- 1 dbrami employees 5.6M Jun 25 10:07 _genome.rev.1.bt2
    1.2M -rw-rw-r-- 1 dbrami employees 1.2M Jun 25 10:07 _genome.rev.2.bt2
    8.0K -rw-rw-r-- 1 dbrami employees 4.9K Jun 25 10:14 _initial_contigs.0.fa
    1.3G -rw-rw-r-- 1 dbrami employees 1.3G Jun 25 10:06 _reads_1.fa
    1.1G -rw-rw-r-- 1 dbrami employees 1.1G Jun 25 10:07 _reads_2.fa
    2.4G -rw-rw-r-- 1 dbrami employees 2.4G Jun 25 10:04 _reads.fa
    3.0G -rw-rw-r-- 1 dbrami employees 3.0G Jun 25 10:14 _reads_genome.0.bowtie
    7.4G -rw-rw-r-- 1 dbrami employees 7.4G Jun 25 10:12 _reads_genome.bowtie

baoe commented 10 years ago

Hi, Daniel,

I see that your left read file has a different size from the right read file. I think this could be the cause of the problem, since I had not considered the case where the two reads of a pair have different lengths. I have updated AlignGraph to handle this case, so you could try the current version of the software and see whether the problem is solved.

I have also updated the manual to make it clearer.

Thanks, Bao

dbrami commented 10 years ago

Yes, trimmed reads often have different lengths. I will try tomorrow.

dbrami commented 10 years ago

It still crashed with a 'Segmentation fault'. Here's the stdout:

(0) Alignment finished

CHROMOSOME 0: (1) chromosome loaded (2) contig alignment loaded

dbrami commented 10 years ago

There may be a confounding factor here: most of the reads will not map to my assembled contigs. I selected a couple of contigs from a small metagenomic assembly but supplied the program with all the reads.

baoe commented 10 years ago

Can you send me the output of "ll tmp" from this run, just like yesterday?

kmhernan commented 10 years ago

Hello,

I believe I am having the same issue. I have added several couts and uncommented some of the ones you had in place to try to locate where the issue occurs. It definitely makes it to the updateKMer function and the updateKBases function (around line 1184). I think it is in the "goto cont" statement on line 1187 (I added some more prints, so the numbers are not exact). The last print statement before the segfault is just before the first if() control flow of the cont: statement (line 1284). I did have a print for when it passed the first if() statement here, and it did not print, but I did not have prints for the other if()s. I have just added them, recompiled, and am re-running. I will update when the results are available, unless you think this is pretty much useless.

kmhernan commented 10 years ago

Ok, here is an update...

Here is a snippet of your code mixed with my prints (starting around line 1275):

cont:
    cout << "cont: A" << endl;
    k2.traversed = 0;
    k2.s = nextS;
    k2.chromosomeID0 = nextID0;
    k2.chromosomeOffset0 = nextOffset0;
    k2.coverage = 0;
    k2.A = k2.C = k2.G = k2.T = k2.N = 0;

    cout << "cont: B" << endl;
    cout << "nextid: " << nextID << ", nextOffset: " << nextOffset << endl;
    cout << genome[nextID][nextOffset].contiMer.size() << endl;
    cout << "nextid0: " << nextID0 << ", nextOffset0: " << nextOffset0 << endl;
    cout << genome[nextID0][nextOffset0].contiMer.size() << endl;

I have additional cout statements at the beginning of each if() block and just before each if() statement; however, the last few lines of stdout are:

    cont: A
    cont: B
    nextid: 0, nextOffset: 28150056
    0
    nextid0: 4294967295, nextOffset0: 4294967295

So it appears that the segfault is caused by trying to look up indices that don't exist (genome[nextID0][nextOffset0]). Any thoughts?
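
For reference, 4294967295 is 2^32 - 1, the largest 32-bit unsigned value, which is also what -1 becomes when stored in an unsigned 32-bit integer. That pattern often indicates a "not set" marker, although the thread does not confirm that this is what happens in AlignGraph. A guard along the following lines would avoid the out-of-range lookup. It is a minimal sketch: `kUnset` and `validPosition` are hypothetical names, and `genome` is assumed to behave like a vector of vectors.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a "not set" position: the largest 32-bit
// unsigned value, which is what -1 becomes when stored as unsigned.
constexpr std::uint32_t kUnset = UINT32_MAX;

// Returns true only when genome[id][offset] is a valid element.
template <typename T>
bool validPosition(const std::vector<std::vector<T>>& genome,
                   std::uint32_t id, std::uint32_t offset) {
    if (id == kUnset || offset == kUnset) return false;
    return id < genome.size() && offset < genome[id].size();
}
```

Calling `validPosition(genome, nextID0, nextOffset0)` before the `contiMer.size()` lookup would return false for the values printed above instead of indexing past the end of the container.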