Open biolancer opened 1 year ago
We discussed today that coverage calculations for target regions should also be included. This would require BAM and BED files as input and should ideally report the coverage of each individual region (these are often individual exons) along with a summary across all regions. This could be done with mosdepth: https://github.com/brentp/mosdepth
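As a rough sketch of the per-region report plus summary: running `mosdepth --by targets.bed sample sample.bam` should write a `sample.regions.bed.gz` file with one line per BED region and the mean depth in the last column. The Python below parses lines in that layout and builds a summary. The function names, the `min_depth` threshold of 20, and the example lines are assumptions made for illustration, not part of the pipeline.

```python
import gzip


def parse_region_lines(lines):
    """Parse mosdepth-style region lines: chrom, start, end, [name], mean_depth."""
    regions = []
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # A name column is only present when the input BED had one.
        name = fields[3] if len(fields) >= 5 else f"{chrom}:{start}-{end}"
        regions.append(
            {"name": name, "chrom": chrom, "start": start, "end": end,
             "mean_depth": float(fields[-1])}
        )
    return regions


def load_regions(path):
    """Read a gzipped regions file such as sample.regions.bed.gz."""
    with gzip.open(path, "rt") as handle:
        return parse_region_lines(handle)


def summarize(regions, min_depth=20.0):
    """Length-weighted mean depth and the regions below min_depth (assumed cutoff)."""
    total_bases = sum(r["end"] - r["start"] for r in regions)
    weighted_sum = sum(r["mean_depth"] * (r["end"] - r["start"]) for r in regions)
    return {
        "n_regions": len(regions),
        "mean_depth": weighted_sum / total_bases if total_bases else 0.0,
        "low_coverage_regions": [r["name"] for r in regions if r["mean_depth"] < min_depth],
    }


# Made-up example lines in the expected layout (not real data).
example_lines = [
    "chr1\t100\t200\tEXON1\t35.50\n",
    "chr1\t300\t500\tEXON2\t12.25\n",
]
example_summary = summarize(parse_region_lines(example_lines))
```

The mean is weighted by region length so that long exons are not under-represented in the summary; a plain per-region average would be a reasonable alternative if every region should count equally.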
As discussed today, a check of VCF file integrity (e.g. that genomic coordinates are concordant with the given reference genome version) could further enhance the pipeline. This would potentially require BAM files as additional input.
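One way such a coordinate check could look, as a minimal sketch: compare the contigs and positions in the VCF against the contig lengths in the reference `.fai` index (tab-separated, with the sequence name and length in the first two columns). This only catches coarse mismatches such as a wrong genome build or a different contig naming scheme; it does not verify REF alleles. The function name and the example lines are hypothetical, and the header parsing is simplified (it does not handle quoted values containing commas).

```python
import re

CONTIG_RE = re.compile(r"^##contig=<(.*)>$")


def check_vcf_against_fai(vcf_lines, fai_lines):
    """Return a list of human-readable problems; an empty list means no mismatch found."""
    # Reference contig lengths from the .fai index (name, length, ...).
    ref_lengths = {}
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            ref_lengths[fields[0]] = int(fields[1])

    problems = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            match = CONTIG_RE.match(line)
            if not match:
                continue
            # Simplified key=value parsing of the ##contig header line.
            pairs = dict(kv.split("=", 1) for kv in match.group(1).split(",") if "=" in kv)
            contig, length = pairs.get("ID"), pairs.get("length")
            if contig in ref_lengths and length is not None and int(length) != ref_lengths[contig]:
                problems.append(f"header length mismatch for {contig}")
            continue
        if line.startswith("#") or not line:
            continue
        fields = line.split("\t")
        chrom, pos = fields[0], int(fields[1])
        if chrom not in ref_lengths:
            problems.append(f"unknown contig {chrom} at position {pos}")
        elif pos < 1 or pos > ref_lengths[chrom]:
            problems.append(f"position {chrom}:{pos} outside contig of length {ref_lengths[chrom]}")
    return problems


# Made-up example input (not real data).
example_fai = ["chr1\t1000\t6\t60\t61\n"]
example_vcf = [
    "##fileformat=VCFv4.2",
    "##contig=<ID=chr1,length=1000>",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t500\t.\tA\tG\t.\tPASS\t.",
    "chr1\t1500\t.\tC\tT\t.\tPASS\t.",
    "chr2\t10\t.\tG\tA\t.\tPASS\t.",
]
example_problems = check_vcf_against_fai(example_vcf, example_fai)
```

A stricter check would also compare each REF allele with the reference sequence at that position, which needs the FASTA itself rather than only its index.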
Description of feature
When sample quality is low, for example, read and coverage statistics are relevant for deciding whether to include or exclude potentially clinically relevant mutations. Generating the required data should therefore be considered during upstream preprocessing of alignments and variant calls.
Visualization could be achieved by generating IGV-compatible data formats (ROI-subsampled BAM, CRAM) during preprocessing. IGV can import sessions based on HTML and XML data formats. For a panel or tumor entity with known regions of interest, based on readily accessible reference data, a report could be generated and (re)loaded into IGV for a quick lookup of recurring problematic regions.
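To illustrate the reloadable-session idea, here is a minimal sketch that writes an XML session listing the tracks and the regions of interest. The element and attribute names (`Session`, `Resources`/`Resource`, `Regions`/`Region`) follow the IGV session layout as it is commonly documented, but they are an assumption here and should be verified against the IGV user guide, as should the expected coordinate convention. The function name, file names, and region values are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET


def build_igv_session(genome, track_paths, regions):
    """Build an IGV-style session XML string (assumed layout, see note above).

    regions is an iterable of (chrom, start, end, label) tuples.
    """
    session = ET.Element("Session", genome=genome, version="8")
    resources = ET.SubElement(session, "Resources")
    for path in track_paths:
        ET.SubElement(resources, "Resource", path=path)
    regions_el = ET.SubElement(session, "Regions")
    for chrom, start, end, label in regions:
        # Coordinates are passed through unchanged from the input.
        ET.SubElement(
            regions_el, "Region",
            chromosome=chrom, start=str(start), end=str(end), description=label,
        )
    return ET.tostring(session, encoding="unicode")


# Placeholder inputs for illustration only.
example_session = build_igv_session(
    genome="hg38",
    track_paths=["sample.roi.bam"],
    regions=[("chr1", 1000, 2000, "ROI_1"), ("chr2", 500, 800, "ROI_2")],
)
```

The resulting string can be written to a `.xml` file next to the ROI-subsampled BAM/CRAM, so the same set of problematic regions can be reopened for each new sample of a panel.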