hartwigmedical / hmftools

Various algorithms for analysing genomics data
GNU General Public License v3.0

Tricking SAGE to work with a modified ref genome #516

Closed by Pranav-Garg 4 months ago

Pranav-Garg commented 4 months ago

I would like to use alignments generated against a coordinate-compatible version of hg38 (https://doi.org/10.1016/j.jmoldx.2021.10.013), which includes "random" and unplaced scaffolds in addition to the standard chromosomes (1-22, X, Y, M), along with a hard-masked locus in chr21.

As mentioned in #401 and #397, your tools only work with no-alt hg19 and hg38. Do you have any suggestions for porting or editing the BAM alignments so that they work with the hg38 setting in your tools? For example, editing the header tags, removing certain lines if they refer to the extra chromosomes, etc.? When running as-is, I run into the same issue as #401. Using the standard hg38 would be problematic for my use case, for reasons described in the linked paper.

I'm primarily interested in SAGE and PURPLE.
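
To make the header edit asked about above concrete, here is a minimal sketch of dropping `@SQ` lines for the extra scaffolds from a SAM header dumped to text. The function name `filter_sam_header`, the `STANDARD_CONTIGS` set, and the example header lines are all hypothetical, chosen for illustration; this is not something hmftools provides or recommends.

```python
# Assumption: the "standard" contigs are chr1..chr22, chrX, chrY and chrM.
STANDARD_CONTIGS = {f"chr{i}" for i in range(1, 23)} | {"chrX", "chrY", "chrM"}


def filter_sam_header(header_text, keep=STANDARD_CONTIGS):
    """Return the SAM header text with @SQ lines for non-kept contigs removed."""
    kept_lines = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            # @SQ fields are tab-separated TAG:VALUE pairs; SN holds the contig name.
            fields = dict(field.split(":", 1) for field in line.split("\t")[1:])
            if fields.get("SN") not in keep:
                continue  # drop random/unplaced scaffolds
        kept_lines.append(line)
    return "\n".join(kept_lines) + "\n"


# Illustrative header lines only, not taken from a real BAM.
example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:248956422\n"
    "@SQ\tSN:chr1_KI270706v1_random\tLN:175055\n"
    "@SQ\tSN:chrUn_KI270302v1\tLN:2274\n"
    "@SQ\tSN:chrM\tLN:16569\n"
)

if __name__ == "__main__":
    print(filter_sam_header(example_header), end="")
```

Editing the header alone would not be enough: any read aligned to a removed contig, or whose mate is, would also have to be dropped or rewritten, otherwise the records would reference sequences that the header no longer declares.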

charlesshale commented 4 months ago

Could you try running this with the latest SAGE version (https://github.com/hartwigmedical/hmftools/releases/tag/sage-v3.4) and see if the problem persists? Thanks.

Pranav-Garg commented 4 months ago

Confirmed that SAGE v3.4 works on my BAMs. Testing GRIDSS+PURPLE next.

Pranav-Garg commented 4 months ago

Confirmed that GRIDSS+PURPLE also works well with my genome. If only I had downloaded the JARs a couple of weeks later... Thanks for fixing it anyway!