CFIA-NCFAD / wgscovplot

The Whole Genome Sequencing Coverage Plot (wgscovplot) is a tool that generates an interactive HTML coverage plot from coverage depth information, variants, and gene features.

Reference sequence is assumed to be available from NCBI if not found in input directory #54

Open peterk87 opened 8 months ago

peterk87 commented 8 months ago

When parsing depths from a BAM file, the reference sequence ID is retrieved from the BAM header. However, that reference sequence may not always be an NCBI sequence. This is an issue with IRMA output, where BAM files have fairly generic reference sequence IDs (e.g. "A_PB2"). In this case, it may be preferable not to try to retrieve any reference sequence, and instead either omit it entirely from the output or spoof it based on a consensus sequence.
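
One possible way to handle this, sketched below, is to check whether the ID from the BAM header even looks like an NCBI accession before attempting a download. This is not wgscovplot's actual code: the function names `looks_like_ncbi_accession` and `choose_reference_source`, the file extensions searched, and the accession regex are all assumptions made for illustration, and the regex is only a rough heuristic that does not cover every NCBI accession format.

```python
import re
from pathlib import Path
from typing import Optional

# Heuristic only (assumption): common NCBI nucleotide accession shapes,
# with an optional ".version" suffix.
_ACCESSION_RE = re.compile(
    r"^(?:"
    r"[A-Z]{2}_\d{6,9}"         # RefSeq-style, e.g. NC_045512
    r"|[A-Z]\d{5}"              # GenBank: 1 letter + 5 digits
    r"|[A-Z]{2}\d{6}(?:\d{2})?" # GenBank: 2 letters + 6 or 8 digits
    r")(?:\.\d+)?$"
)


def looks_like_ncbi_accession(ref_id: str) -> bool:
    """Return True if ref_id resembles an NCBI nucleotide accession."""
    return bool(_ACCESSION_RE.match(ref_id))


def choose_reference_source(
    ref_id: str,
    input_dir: Path,
    consensus_fasta: Optional[Path] = None,
) -> str:
    """Decide where the reference sequence should come from.

    Returns one of "local", "ncbi", "consensus", or "omit".
    (Hypothetical helper; the return labels are illustrative.)
    """
    # 1. Prefer a FASTA file in the input directory named after the ID.
    for ext in (".fasta", ".fa", ".fna"):
        if (input_dir / f"{ref_id}{ext}").is_file():
            return "local"
    # 2. Only query NCBI when the ID plausibly is an accession.
    if looks_like_ncbi_accession(ref_id):
        return "ncbi"
    # 3. Otherwise fall back to a consensus sequence if one is available.
    if consensus_fasta is not None and consensus_fasta.is_file():
        return "consensus"
    # 4. Last resort: leave the reference sequence out of the output.
    return "omit"
```

With this sketch, a generic IRMA-style ID such as "A_PB2" never triggers an NCBI lookup, while an ID such as "MN908947.3" still would when no local FASTA is found.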