luntergroup / octopus

Bayesian haplotype-based mutation calling
MIT License

Early crash in multithreaded mode #52

Closed: cinquin closed this issue 5 years ago

cinquin commented 5 years ago

Describe the bug: The program fails early when run in multithreaded mode (in both debug and release builds). Note that the input BAM file only has reads mapped to chromosome 1. The stack trace of the crashed thread, from the debug build, is below.

When the --thread argument is changed to --thread 1, the problem does not occur.

* thread #2, stop reason = EXC_BAD_ACCESS (code=1, address=0x4)
  * frame #0: 0x0000000102c9ee1c libhts.2.dylib`bgzf_read_block + 1108
    frame #1: 0x0000000102c9f973 libhts.2.dylib`bgzf_read + 69
    frame #2: 0x0000000102cb5f9b libhts.2.dylib`bam_read1 + 51
    frame #3: 0x0000000102cb6e20 libhts.2.dylib`bam_readrec + 34
    frame #4: 0x0000000102cafdf1 libhts.2.dylib`hts_itr_next + 243
    frame #5: 0x0000000100306c21 octopus_DEBUG`octopus::io::HtslibSamFacade::HtslibIterator::operator++() + 161
    frame #6: 0x000000010030a5a0 octopus_DEBUG`octopus::io::HtslibSamFacade::extract_read_positions(octopus::GenomicRegion const&, unsigned long) const + 272
    frame #7: 0x000000010030b122 octopus_DEBUG`octopus::io::HtslibSamFacade::extract_read_positions(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, octopus::GenomicRegion const&, unsigned long) const + 370
    frame #8: 0x000000010030bcc6 octopus_DEBUG`octopus::io::HtslibSamFacade::extract_read_positions(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, octopus::GenomicRegion const&, unsigned long) const + 390
    frame #9: 0x000000010036f232 octopus_DEBUG`octopus::io::ReadReader::extract_read_positions(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, octopus::GenomicRegion const&, unsigned long) const + 146
    frame #10: 0x0000000100337852 octopus_DEBUG`octopus::io::ReadManager::find_covered_subregion(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, octopus::GenomicRegion const&, unsigned long) const + 898
    frame #11: 0x0000000101043f14 octopus_DEBUG`octopus::(anonymous namespace)::find_max_window(octopus::ContigCallingComponents const&, octopus::GenomicRegion const&) + 148
    frame #12: 0x0000000101043940 octopus_DEBUG`octopus::(anonymous namespace)::propose_call_subregion(octopus::ContigCallingComponents const&, octopus::GenomicRegion const&, boost::optional<unsigned int>) + 80
    frame #13: 0x0000000101042ca9 octopus_DEBUG`octopus::(anonymous namespace)::make_region_tasks(octopus::GenomicRegion const&, octopus::ContigCallingComponents const&, octopus::ExecutionPolicy, std::__1::queue<octopus::(anonymous namespace)::Task, std::__1::deque<octopus::(anonymous namespace)::Task, std::__1::allocator<octopus::(anonymous namespace)::Task> > >&, octopus::(anonymous namespace)::TaskMakerSyncPacket&, bool, bool) + 201
    frame #14: 0x00000001010429fe octopus_DEBUG`octopus::(anonymous namespace)::make_contig_tasks(octopus::ContigCallingComponents const&, octopus::ExecutionPolicy, std::__1::queue<octopus::(anonymous namespace)::Task, std::__1::deque<octopus::(anonymous namespace)::Task, std::__1::allocator<octopus::(anonymous namespace)::Task> > >&, octopus::(anonymous namespace)::TaskMakerSyncPacket&, bool) + 1022
    frame #15: 0x0000000101040914 octopus_DEBUG`octopus::(anonymous namespace)::make_tasks_helper(std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::queue<octopus::(anonymous namespace)::Task, std::__1::deque<octopus::(anonymous namespace)::Task, std::__1::allocator<octopus::(anonymous namespace)::Task> > >, octopus::(anonymous namespace)::ContigOrder, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::queue<octopus::(anonymous namespace)::Task, std::__1::deque<octopus::(anonymous namespace)::Task, std::__1::allocator<octopus::(anonymous namespace)::Task> > > > > >&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, octopus::GenomeCallingComponents&, unsigned int, octopus::ExecutionPolicy, octopus::(anonymous namespace)::TaskMakerSyncPacket&) + 756
    frame #16: 0x000000010104c1bc octopus_DEBUG`void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::queue<octopus::(anonymous namespace)::Task, std::__1::deque<octopus::(anonymous namespace)::Task, std::__1::allocator<octopus::(anonymous namespace)::Task> > >, octopus::(anonymous namespace)::ContigOrder, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::queue<octopus::(anonymous namespace)::Task, std::__1::deque<octopus::(anonymous namespace)::Task, std::__1::allocator<octopus::(anonymous namespace)::Task> > > > > >&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, octopus::GenomeCallingComponents&, unsigned int, octopus::ExecutionPolicy, octopus::(anonymous namespace)::TaskMakerSyncPacket&), std::__1::reference_wrapper<std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::queue<octopus::(anonymous namespace)::Task, std::__1::deque<octopus::(anonymous namespace)::Task, std::__1::allocator<octopus::(anonymous namespace)::Task> > >, octopus::(anonymous namespace)::ContigOrder, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::queue<octopus::(anonymous namespace)::Task, std::__1::deque<octopus::(anonymous namespace)::Task, std::__1::allocator<octopus::(anonymous namespace)::Task> > > > > > >, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, 
std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::reference_wrapper<octopus::GenomeCallingComponents>, unsigned int, octopus::ExecutionPolicy, std::__1::reference_wrapper<octopus::(anonymous namespace)::TaskMakerSyncPacket> > >(void*) + 1580
    frame #17: 0x00007fff5a79a305 libsystem_pthread.dylib`_pthread_body + 126
    frame #18: 0x00007fff5a79d26f libsystem_pthread.dylib`_pthread_start + 70
    frame #19: 0x00007fff5a799415 libsystem_pthread.dylib`thread_start + 13

Command line used to run octopus:

$ octopus --reads input.bam --reference hg38.fa --assemble-all --full-bamout --split-bamout realigned -o calls.vcf --thread


dancooke commented 5 years ago

Thanks for the bug report. The problem is that the operating system's limit on the number of open files (i.e. ulimit -n) is being exceeded. When calling with multiple threads, Octopus creates a temporary BCF file for each unique contig in the calling region set. Since you're not explicitly specifying any input regions in your command, the default behaviour is to call all contigs in the reference genome. That's all 3366 contigs in GRCh38_full_analysis_set_plus_decoy_hla.fa, even though reads are only mapped to chr1.
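To check in advance whether a run is likely to hit this limit, something like the following rough sketch may help. It is not part of Octopus: the function names and the `reserved` allowance for other open files are assumptions made for illustration. It counts the contigs listed in a FASTA index (.fai, one line per contig) and compares that count with the soft open-file limit reported by Python's `resource` module (Unix only).

```python
import resource


def count_contigs(fai_lines):
    """Count contigs in a FASTA index (.fai): one non-empty line per contig."""
    return sum(1 for line in fai_lines if line.strip())


def exceeds_open_file_limit(n_contigs, reserved=16, soft_limit=None):
    """Return True if one open file per contig would exceed the soft limit.

    `reserved` is a hypothetical allowance for other descriptors
    (input BAMs, the reference, stdio); it is a guess, not a measured value.
    """
    if soft_limit is None:
        # Same value that `ulimit -n` reports in the shell.
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return False
    return n_contigs + reserved > soft_limit
```

For example, passing the lines of the reference's .fai file to `count_contigs` and the result to `exceeds_open_file_limit` would flag a 3366-contig reference against a soft limit of a few hundred descriptors, while a single-contig region set would pass.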

A robust solution to this is not going to be simple. However, it's fairly easy for you to avoid this problem by either:

dancooke commented 5 years ago

For my own information, some ideas to help - but not solve - this problem:

cinquin commented 5 years ago

Thanks; adding -T chr1 does work around the problem.

dancooke commented 5 years ago

As of commit 42fa364aa3ca64ea25613458c8f6a45dbab5f34f, Octopus opens and closes temporary BCFs as needed, which should reduce the occurrence of this type of error. Error reporting is also better. You should now be able to run your command without the -T chr1 (although I'd still recommend including it if you're only calling chromosome 1).
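For readers curious about the general idea of opening and closing files as needed, here is a minimal sketch. It is not the Octopus implementation (which is C++ and writes BCF); the `BoundedFilePool` class and its methods are hypothetical names chosen for illustration. The pool keeps at most `max_open` files open at once, closes the least recently used one when the cap is reached, and reopens a previously written file in append mode.

```python
from collections import OrderedDict


class BoundedFilePool:
    """Keep at most `max_open` files open; reopen others on demand."""

    def __init__(self, max_open):
        if max_open < 1:
            raise ValueError("max_open must be at least 1")
        self.max_open = max_open
        self._open = OrderedDict()  # path -> handle, least recently used first
        self._seen = set()          # paths that have been created already

    def write(self, path, data):
        handle = self._open.pop(path, None)
        if handle is None:
            if len(self._open) >= self.max_open:
                # Close the least recently used file to free a descriptor.
                _, oldest = self._open.popitem(last=False)
                oldest.close()
            # Truncate on first use, append when reopening.
            mode = "ab" if path in self._seen else "wb"
            handle = open(path, mode)
            self._seen.add(path)
        self._open[path] = handle  # mark as most recently used
        handle.write(data)

    def open_count(self):
        return len(self._open)

    def close(self):
        for handle in self._open.values():
            handle.close()
        self._open.clear()
```

With this kind of scheme the number of descriptors in use is bounded by `max_open` rather than by the number of contigs, at the cost of extra open and close calls when writes jump between many files.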