MGHComputationalPathology / CellTics

Center for Integrated Diagnostics at Mass General Hospital NGS tools
BSD 3-Clause "New" or "Revised" License

reference fasta file provided via --ref-seq option is NOT used if a bam file is not provided via --bam-file #10

Open guruprasada opened 4 years ago

guruprasada commented 4 years ago

Hi @alliemclean, it looks like when we supply our own reference FASTA file (via --ref-seq) without a BAM file (via --bam-file), the tool ignores the FASTA file and queries UCSC instead. The reference FASTA appears to be used only when a BAM file is also provided. Why is that the case? This seems to happen on lines 588-592 of vargroup.py (in the bam_and_merge_multiprocess function), where the sequence dictionary (fdict) is loaded from the reference FASTA ONLY if bam_file is not None. Otherwise, fdict is set to None, which forces the get_reference_seq() function to call UCSC.
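
To make the report concrete, here is a minimal sketch of the control flow described above. It is not the actual code from vargroup.py: load_fasta_dict, fdict_as_reported and fdict_decoupled are hypothetical names invented for illustration, and the real loader in CellTics may work quite differently. The first function mirrors the reported behaviour (FASTA read only when a BAM file is given); the second shows one possible way to decouple the two options, assuming nothing else depends on the BAM file at that point.

```python
import os
import tempfile


def load_fasta_dict(path):
    """Parse a FASTA file into {sequence_name: sequence} (hypothetical helper)."""
    fdict = {}
    name = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if name is not None:
                    fdict[name] = "".join(chunks)
                name = line[1:].split()[0]
                chunks = []
            else:
                chunks.append(line)
    if name is not None:
        fdict[name] = "".join(chunks)
    return fdict


def fdict_as_reported(ref_seq, bam_file):
    # Behaviour described in this issue (assumed, not copied from vargroup.py):
    # the FASTA is read only when a BAM file is also supplied.
    if bam_file is not None:
        return load_fasta_dict(ref_seq)
    return None  # a None fdict is what sends the lookup to UCSC


def fdict_decoupled(ref_seq, bam_file):
    # Possible alternative: honour --ref-seq whether or not --bam-file is set.
    if ref_seq is not None:
        return load_fasta_dict(ref_seq)
    return None


# Small demonstration with a throwaway FASTA file and no BAM file.
with tempfile.NamedTemporaryFile("w", suffix=".fa", delete=False) as tmp:
    tmp.write(">chr1 test\nACGT\nACGT\n>chr2\nGG\n")
    fasta_path = tmp.name

no_bam_reported = fdict_as_reported(fasta_path, None)
no_bam_decoupled = fdict_decoupled(fasta_path, None)
os.remove(fasta_path)
```

With no BAM file, the first variant leaves fdict as None even though a FASTA path was given, while the second returns the parsed sequences, so a lookup would not need to reach UCSC.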