vice87 / gam-ngs

Genomic Assemblies Merger for NGS
GNU General Public License v3.0

Issue with gam-create #12

Open juefish opened 10 years ago

juefish commented 10 years ago

It seems I am having an issue similar to some of the others that have been reported. The test data set ran fine, but with my own data gam-create aborts. Here is the debugging output:

Starting program: /opt/gam-ngs-master/bin/gam-create --master-bam master.PE.bams.txt --slave-bam slave.PE.bams.txt --min-block-size 5 --output limulus.washu_CA_final.block.5
Missing separate debuginfo for /lib64/ld-linux-x86-64.so.2
Try: zypper install -C "debuginfo(build-id)=3d68b93f8701971da6133437486f3909223534f8"
Missing separate debuginfo for /lib64/libz.so.1
Try: zypper install -C "debuginfo(build-id)=4c05d1eb180f9c02b81a0c559c813dada91e0ca4"
Missing separate debuginfo for /lib64/libpthread.so.0
Try: zypper install -C "debuginfo(build-id)=bb81b1117fc93fc0eafb3e96eabfca4d8976c879"
[Thread debugging using libthread_db enabled]
Missing separate debuginfo for /usr/lib64/libboost_graph.so.1.46.1
Try: zypper install -C "debuginfo(build-id)=ecf1135eedfbd99bdd9407181456697644252ec8"
Missing separate debuginfo for /usr/lib64/libboost_program_options.so.1.46.1
Try: zypper install -C "debuginfo(build-id)=8731e047567f262aa8b00a231fae81930cf053a8"
Missing separate debuginfo for /usr/lib64/libboost_system.so.1.46.1
Try: zypper install -C "debuginfo(build-id)=5eb8d720f9cfbbf4994252c3b92c8ec6cbf000b1"
Missing separate debuginfo for /usr/lib64/libboost_filesystem.so.1.46.1
Try: zypper install -C "debuginfo(build-id)=3afa654b8c33ee47addb8d12ef62bd347500b892"
Missing separate debuginfo for /usr/lib64/libstdc++.so.6
Try: zypper install -C "debuginfo(build-id)=8af185f68d03ac42c800bcf056d0c38d0e21442c"
Missing separate debuginfo for /lib64/libm.so.6
Try: zypper install -C "debuginfo(build-id)=474097bd34a7d5895eb394c73dafa8ce4583dddf"
Missing separate debuginfo for /lib64/libgcc_s.so.1
Try: zypper install -C "debuginfo(build-id)=f4530dd94c1cda900b928ba0acd48b024d2c0c62"
Missing separate debuginfo for /lib64/libc.so.6
Try: zypper install -C "debuginfo(build-id)=2636ed5ff526582a49ed9b6e982c231335db1620"
Missing separate debuginfo for /usr/lib64/libboost_regex.so.1.46.1
Try: zypper install -C "debuginfo(build-id)=ba530855a908d3b8635723c54d89dacbc029820b"
Missing separate debuginfo for /usr/lib64/libicuuc.so.42
Try: zypper install -C "debuginfo(build-id)=6d6a1a1773aec08b3eeff745c5510b64d5a23296"
Missing separate debuginfo for /usr/lib64/libicui18n.so.42
Try: zypper install -C "debuginfo(build-id)=ff51790de636a30d64aed85684afc630ab0eb36e"

Program received signal SIGABRT, Aborted.
0x00007ffff65d0b55 in raise () from /lib64/libc.so.6

#0 0x00007ffff65d0b55 in raise () from /lib64/libc.so.6
#1 0x00007ffff65d2131 in abort () from /lib64/libc.so.6
#2 0x00007ffff660de2f in __libc_message () from /lib64/libc.so.6
#3 0x00007ffff6613558 in malloc_printerr () from /lib64/libc.so.6
#4 0x00007ffff661650f in _int_malloc () from /lib64/libc.so.6
#5 0x00007ffff66185e7 in malloc () from /lib64/libc.so.6
#6 0x00007ffff6e5e05d in operator new(unsigned long) () from /usr/lib64/libstdc++.so.6
#7 0x00007ffff6e5e179 in operator new[](unsigned long) () from /usr/lib64/libstdc++.so.6
#8 0x00000000004777ee in BamTools::Internal::BamReaderPrivate::LoadNextAlignment(BamTools::BamAlignment&) ()
#9 0x0000000000477e36 in BamTools::Internal::BamReaderPrivate::GetNextAlignmentCore(BamTools::BamAlignment&) ()
#10 0x0000000000479368 in BamTools::Internal::BamReaderPrivate::GetNextAlignment(BamTools::BamAlignment&) ()
#11 0x0000000000469d3c in MultiBamReader::GetNextAlignment (this=0x7fffffffd650, align=..., update_stats=true) at /opt/gam-ngs-master/lib/src/bam/MultiBamReader.cc:344
#12 0x0000000000464537 in Block::findBlocks (outblocks=..., bamReader=..., minBlockSize=5, readsMap_1=..., readsMap_2=..., coverage=...) at /opt/gam-ngs-master/lib/src/assembly/Block.cc:495
#13 0x00000000004493fa in modules::CreateBlocks::execute (this=0x7fffffffdc20) at /opt/gam-ngs-master/src/CreateBlocks.cc:141
#14 0x00000000004471a6 in main (argc=9, argv=0x7fffffffdd28) at /opt/gam-ngs-master/src/gam-create.cc:49

Any thoughts?

Nate

vice87 commented 10 years ago

There might be a problem with the slave BAM file or inside the library I use to handle BAMs (or both). What tool did you use to obtain the master and slave bams? Could you try to swap the master and slave inputs? If you do so, do you get the same error at the same point or earlier (e.g., when loading master bam's reads)?

In the meantime, I'll update the BamTools library.

juefish commented 10 years ago

Sure, I'll try. I used bowtie2 --end-to-end --very-sensitive settings. I know your test script uses bwa, think that's it? I'll try switching master and slave as well and let you know how it goes.

Nate

vice87 commented 10 years ago

I don't know if the problem is due to bowtie (I don't think it is). Moreover, swapping master and slave will definitely not resolve the problem (I'm just trying to understand what the cause of the bug is).

However, I have updated the repository with a recent version of the library used to handle BAM files. Hopefully this fixes the problem. Let me know whether it does.

Best, Riccardo

juefish commented 10 years ago

Riccardo,

I did the read mapping with BWA instead and gam-create was able to finish. I'm not sure how bowtie2 would differ in this case, but it did. I haven't tried your updated repository yet. I don't have time right now, but I will try it later and let you know whether it fixes the problem.

Now, however, I'm running into an issue with gam-merge. The run aborted after this output:

[main] Loading blocks
[main] Loaded blocks = 4281438
[main] Loading BAMs data
[bam] Master PE-alignments file master.PE.bams.txt successfully opened:
/data2/projects/limulus/combinedAssembly/genome/gam/illumina_1_to_washu.bwa.sorted.bam inserts size = 195.128 +/- 30.7749 coverage = 16.6835
/data2/projects/limulus/combinedAssembly/genome/gam/illumina_2_to_washu.bwa.sorted.bam inserts size = 195.128 +/- 30.7749 coverage = 16.6835
/data2/projects/limulus/combinedAssembly/genome/gam/miSEQ_1_to_washu.bwa.sorted.bam inserts size = 419.253 +/- 106.717 coverage = 1.9293
/data2/projects/limulus/combinedAssembly/genome/gam/miSEQ_2_to_washu.bwa.sorted.bam inserts size = 416.926 +/- 106.259 coverage = 2.5483
/data2/projects/limulus/combinedAssembly/genome/gam/454_to_washu.bwa.sorted.bam inserts size = 2606.33 +/- 836.716 coverage = 0.220091
[bam] Slave PE-alignments file slave.PE.bams.txt successfully opened:
/data2/projects/limulus/combinedAssembly/genome/gam/illumina_1_to_ca_assembly.bwa.sorted.bam inserts size = 191.833 +/- 34.7505 coverage = 19.3185
/data2/projects/limulus/combinedAssembly/genome/gam/illumina_2_to_ca_assembly.bwa.sorted.bam inserts size = 191.825 +/- 34.7565 coverage = 19.6631
/data2/projects/limulus/combinedAssembly/genome/gam/miSEQ_1_to_ca_assembly.sorted.bam inserts size = 402.938 +/- 103.229 coverage = 2.53132
/data2/projects/limulus/combinedAssembly/genome/gam/miSEQ_2_to_ca_assembly.sorted.bam inserts size = 402.178 +/- 104.073 coverage = 3.27384
/data2/projects/limulus/combinedAssembly/genome/gam/454_to_ca_assembly.sorted.bam inserts size = 2462.77 +/- 736.555 coverage = 0.255506
[main] Loading contigs data
Aborted

Error output said the following:

terminate called after throwing an instance of 'std::out_of_range'
what(): vector::_M_range_check

Thoughts?

Thanks, Nate Jue

vice87 commented 10 years ago

Could you please post the commands you ran (both gam-create and gam-merge)?

juefish commented 10 years ago

Sure, here you go:

/opt/gam-ngs-master/bin/gam-create --slave-bam master.PE.bams.txt --master-bam slave.PE.bams.txt --min-block-size 5 --output limulus.washu_CA_final.block.5

/opt/gam-ngs-master/bin/gam-merge --master-bam master.PE.bams.txt --slave-bam slave.PE.bams.txt --blocks-file limulus.washu_CA_final.block.5.blocks --master-fasta ../../../limulus.washu.unplaced.scaf.fa --slave-fasta ../../../limulus.final.CA.SSPACE_SCF_merge_Nwblr_SCF_LRNA_SCF.fasta --min-block-size 5 --threads 12 --output limulus.washu_CA_final.merged 2> merge.err

Nate

juefish commented 10 years ago

Riccardo,

Do you think this error may have come from an out-of-memory issue? I was using a system with ~500GB of memory, but there were probably multiple projects running at one time on it. Perhaps, it crashed because of that? The aforementioned error message didn't seem to indicate that was the issue: std out of range. Any thoughts?

Thanks, Nate

vice87 commented 10 years ago

I think the problem is related to an inconsistency between the headers of the BAM files. There might be one (or more) BAMs of the same "type" (i.e., master/slave) which have a different set of sequences. Are you sure all the mappings were done against the same master (or slave) assembly?
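
A rough sketch of how that header check could be done, assuming each BAM header has first been dumped to text (for example with `samtools view -H`; the thread does not say how the headers were extracted). It compares the ordered list of @SQ sequence names between files of the same type. The function names `sq_names` and `inconsistent_headers` are hypothetical and are not part of gam-ngs.

```python
def sq_names(header_text):
    """Return the ordered list of sequence names found in the @SQ lines of a SAM header."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # @SQ fields are tab-separated TAG:VALUE pairs; SN holds the sequence name.
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
                break
    return names


def inconsistent_headers(headers):
    """Given {label: header_text}, return the labels whose @SQ list differs from the first entry."""
    items = list(headers.items())
    if not items:
        return []
    reference = sq_names(items[0][1])
    return [label for label, text in items[1:] if sq_names(text) != reference]


if __name__ == "__main__":
    # Toy headers, made up for illustration only.
    a = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:scaf1\tLN:1000\n@SQ\tSN:scaf2\tLN:500\n"
    b = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:scaf1\tLN:1000\n@SQ\tSN:scaf2\tLN:500\n"
    c = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:scaf2\tLN:500\n@SQ\tSN:scaf1\tLN:1000\n"
    print(inconsistent_headers({"a": a, "b": b, "c": c}))  # prints ['c']
```

The comparison is order-sensitive on purpose: numeric reference ids in a BAM file follow the order of the @SQ lines, so two headers with the same names in a different order would still assign different ids to the same sequence.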

vice87 commented 10 years ago

I've updated the repository to print more debugging information at the point where gam-merge gives you that error.

juefish commented 10 years ago

Riccardo,

I double-checked all the headers for master and slave: all respective headers are the same, and all mappings were done against the appropriate assembly, so I don't think it's that. I ran the debugging run of gam-merge and got this error message when the program breaks:

[getNoBlocksContigs] error: found a block with master id 550225 when the admissible range is [0,286792)
[Inferior 1 (process 38116) exited with code 01]

I've attached the gdb.txt file as well, although I'm not sure how much more informative it will be.

Thanks, Nate

Starting program: /opt/gam-ngs-master/bin/gam-merge /opt/gam-ngs-master/bin/gam-merge --master-bam master.PE.bams.txt --slave-bam slave.PE.bams.txt --blocks-file limulus.washu_CA_final.block.5.blocks --master-fasta ../../../limulus.washu.unplaced.scaf.fa --slave-fasta ../../../limulus.final.CA.SSPACE_SCF_merge_Nwblr_SCF_LRNA_SCF.fasta --min-block-size 5 --threads 30 --output limulus.washu_CA_final.merged
Missing separate debuginfo for /lib64/ld-linux-x86-64.so.2
Try: zypper install -C "debuginfo(build-id)=3d68b93f8701971da6133437486f3909223534f8"
Missing separate debuginfo for /lib64/libpthread.so.0
Try: zypper install -C "debuginfo(build-id)=bb81b1117fc93fc0eafb3e96eabfca4d8976c879"
[Thread debugging using libthread_db enabled]
Missing separate debuginfo for /usr/lib64/libboost_graph.so.1.46.1
Try: zypper install -C "debuginfo(build-id)=ecf1135eedfbd99bdd9407181456697644252ec8"
Missing separate debuginfo for /usr/lib64/libboost_program_options.so.1.46.1
Try: zypper install -C "debuginfo(build-id)=8731e047567f262aa8b00a231fae81930cf053a8"
Missing separate debuginfo for /usr/lib64/libboost_system.so.1.46.1
Try: zypper install -C "debuginfo(build-id)=5eb8d720f9cfbbf4994252c3b92c8ec6cbf000b1"
Missing separate debuginfo for /usr/lib64/libboost_filesystem.so.1.46.1
Try: zypper install -C "debuginfo(build-id)=3afa654b8c33ee47addb8d12ef62bd347500b892"
Missing separate debuginfo for /usr/lib64/libstdc++.so.6
Try: zypper install -C "debuginfo(build-id)=8af185f68d03ac42c800bcf056d0c38d0e21442c"
Missing separate debuginfo for /lib64/libm.so.6
Try: zypper install -C "debuginfo(build-id)=474097bd34a7d5895eb394c73dafa8ce4583dddf"
Missing separate debuginfo for /lib64/libgcc_s.so.1
Try: zypper install -C "debuginfo(build-id)=f4530dd94c1cda900b928ba0acd48b024d2c0c62"
Missing separate debuginfo for /lib64/libc.so.6
Try: zypper install -C "debuginfo(build-id)=2636ed5ff526582a49ed9b6e982c231335db1620"
Missing separate debuginfo for /usr/lib64/libboost_regex.so.1.46.1
Try: zypper install -C "debuginfo(build-id)=ba530855a908d3b8635723c54d89dacbc029820b"
Missing separate debuginfo for /usr/lib64/libicuuc.so.42
Try: zypper install -C "debuginfo(build-id)=6d6a1a1773aec08b3eeff745c5510b64d5a23296"
Missing separate debuginfo for /usr/lib64/libicui18n.so.42
Try: zypper install -C "debuginfo(build-id)=ff51790de636a30d64aed85684afc630ab0eb36e"
[Inferior 1 (process 38116) exited with code 01]
No stack.

vice87 commented 10 years ago

It's really strange. The problem is that the block contains a reference to the 550226th sequence of the master fasta, while the master fasta contains just 286793 sequences. I'm sorry for being pedantic, but are you sure you are using the same blocks file you created with gam-create, with the same lists of BAMs as input?
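
To make the range problem concrete, here is a minimal sketch of the kind of bounds check being described. It is not the gam-ngs implementation; the function name `out_of_range_ids` and the list of ids are hypothetical, and only the values 550225 and 286793 come from the thread.

```python
def out_of_range_ids(block_master_ids, num_master_sequences):
    """Return the ids that fall outside the admissible range [0, num_master_sequences)."""
    return [i for i in block_master_ids if not 0 <= i < num_master_sequences]


if __name__ == "__main__":
    # 550225 is the id from the error message; the other ids are made up.
    ids = [0, 17, 286792, 550225]
    print(out_of_range_ids(ids, 286793))  # prints [550225]
```

An id that is too large for one assembly but would be valid for the other is a hint that the two inputs were exchanged at some step, which is what the thread eventually finds.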

juefish commented 10 years ago

You're right, that seems odd, but I've double-checked the files and everything seems to be correct. For instance, here is the number of lines in each BAM file header:

722319 454_to_ca_assembly.sorted.header
286794 454_to_washu.bwa.sorted.header
722319 illumina_1_to_ca_assembly.bwa.sorted.header
286794 illumina_1_to_washu.bwa.sorted.header
722319 illumina_2_to_ca_assembly.bwa.sorted.header
286794 illumina_2_to_washu.bwa.sorted.header
722319 miSEQ_1_to_ca_assembly.sorted.header
286794 miSEQ_1_to_washu.bwa.sorted.header
722319 miSEQ_2_to_ca_assembly.sorted.header
286794 miSEQ_2_to_washu.bwa.sorted.header

while the master and slave files are the following:

master.PE.bam.txt:

/data2/projects/limulus/combinedAssembly/genome/gam/illumina_1_to_washu.bwa.sorted.bam 50 350
/data2/projects/limulus/combinedAssembly/genome/gam/illumina_2_to_washu.bwa.sorted.bam 50 350
/data2/projects/limulus/combinedAssembly/genome/gam/miSEQ_1_to_washu.bwa.sorted.bam 130 670
/data2/projects/limulus/combinedAssembly/genome/gam/miSEQ_2_to_washu.bwa.sorted.bam 130 670
/data2/projects/limulus/combinedAssembly/genome/gam/454_to_washu.bwa.sorted.bam 500 4500

slave.PE.bam.txt:

/data2/projects/limulus/combinedAssembly/genome/gam/illumina_1_to_ca_assembly.bwa.sorted.bam 50 350
/data2/projects/limulus/combinedAssembly/genome/gam/illumina_2_to_ca_assembly.bwa.sorted.bam 50 350
/data2/projects/limulus/combinedAssembly/genome/gam/miSEQ_1_to_ca_assembly.sorted.bam 130 670
/data2/projects/limulus/combinedAssembly/genome/gam/miSEQ_2_to_ca_assembly.sorted.bam 130 670
/data2/projects/limulus/combinedAssembly/genome/gam/454_to_ca_assembly.sorted.bam 500 4500

And the two commands I used were the following:

/opt/gam-ngs-master/bin/gam-create --slave-bam master.PE.bams.txt --master-bam slave.PE.bams.txt --min-block-size 5 --output limulus.washu_CA_final.block.5

/opt/gam-ngs-master/bin/gam-merge --master-bam master.PE.bams.txt --slave-bam slave.PE.bams.txt --blocks-file limulus.washu_CA_final.block.5.blocks --master-fasta ../../../limulus.washu.unplaced.scaf.fa --slave-fasta ../../../limulus.final.CA.SSPACE_SCF_merge_Nwblr_SCF_LRNA_SCF.fasta --min-block-size 5 --threads 30 --output limulus.washu_CA_final.merged 2> merge.err

I could re-do the mapping, but I don't think that's the issue, do you?

Thanks, Nate

vice87 commented 10 years ago

Dear Nate, sorry again for the tardiness of my reply. In this period I'm very busy with my PhD work. If I weren't so busy I would probably have found the problem sooner.

In the gam-create command you specified the master alignments as --slave-bam (and the slave ones as --master-bam), while in the gam-merge command you correctly set the former as --master-bam and the latter as --slave-bam.

In order to make everything work, you should rebuild the blocks file (swapping master and slave) and then execute the same gam-merge command.

I think I should provide (don't know if/when :P) an additional script that simplifies running gam-create and gam-merge in sequence, to avoid this kind of oversight.

Best, Riccardo
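
As a rough illustration of the kind of safeguard such a script could include, the sketch below compares the values given to --master-bam and --slave-bam in the two command lines and reports any disagreement. It only inspects the command strings and does not run either tool; the helper names `option_values` and `role_mismatches` are hypothetical and not part of gam-ngs.

```python
import shlex


def option_values(command, options):
    """Map each option in `options` to the token that follows it in `command`."""
    tokens = shlex.split(command)
    values = {}
    for i, tok in enumerate(tokens[:-1]):
        if tok in options:
            values[tok] = tokens[i + 1]
    return values


def role_mismatches(create_cmd, merge_cmd, options=("--master-bam", "--slave-bam")):
    """Return {option: (create_value, merge_value)} for options whose values differ."""
    create = option_values(create_cmd, options)
    merge = option_values(merge_cmd, options)
    return {
        opt: (create.get(opt), merge.get(opt))
        for opt in options
        if create.get(opt) != merge.get(opt)
    }


if __name__ == "__main__":
    # Shortened versions of the two commands posted in this thread.
    create_cmd = ("gam-create --slave-bam master.PE.bams.txt "
                  "--master-bam slave.PE.bams.txt --min-block-size 5 "
                  "--output limulus.washu_CA_final.block.5")
    merge_cmd = ("gam-merge --master-bam master.PE.bams.txt "
                 "--slave-bam slave.PE.bams.txt "
                 "--blocks-file limulus.washu_CA_final.block.5.blocks")
    for opt, (c, m) in role_mismatches(create_cmd, merge_cmd).items():
        print(f"{opt}: gam-create got {c!r}, gam-merge got {m!r}")
```

With the commands from this thread, both options are reported, since each list file was given the opposite role in gam-create.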