Hi @alliemclean, it looks like when we provide our own reference fasta file (via --ref-seq) but no BAM file (via --bam-file), the tool ignores the fasta file and queries UCSC instead. Our reference fasta appears to be used only when a BAM file is also provided. Why is that the case? This seems to happen on lines 588-592 (within the bam_and_merge_multiprocess function in vargroup.py), where the sequence dictionary (fdict) is loaded from the reference fasta ONLY if bam_file is not None. Otherwise, fdict is set to None, which forces the get_reference_seq() function to call UCSC.
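To make the question concrete, here is a minimal sketch of the behavior as I understand it, next to one possible alternative. This is not the actual code from vargroup.py: load_fasta_dict, fdict_as_described, and fdict_independent_of_bam are hypothetical names I made up for illustration, and the fasta parsing is a simple stand-in for however the tool really builds fdict.

```python
def load_fasta_dict(path):
    """Hypothetical stand-in: parse a fasta file into {sequence_name: sequence}."""
    chunks = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the name.
                name = line[1:].split()[0]
                chunks[name] = []
            elif name is not None:
                chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def fdict_as_described(ref_seq, bam_file):
    """Behavior described above: the fasta is only read when a BAM file is given."""
    if bam_file is not None:
        return load_fasta_dict(ref_seq)
    # fdict is None, so the reference lookup would fall back to UCSC.
    return None


def fdict_independent_of_bam(ref_seq, bam_file=None):
    """Possible alternative (an assumption, not a tested fix): depend only on ref_seq."""
    if ref_seq is not None:
        return load_fasta_dict(ref_seq)
    return None
```

With the second version, passing --ref-seq without --bam-file would still yield a populated fdict, so get_reference_seq() would presumably not need to call UCSC. I may be missing a reason the BAM file is required here, though.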