Open annerilotter opened 6 months ago
Dear @annerilotter, this is indeed a bit tricky, I think. So you want to produce a phased BAM file that follows the phasing of the assembly, right? You could use dipcall to call variants of your phased assembly vs. the reference genome. Then take these phased SNVs and try to tag the reads in a mapped BAM file (e.g. with WhatsHap), and then give the resulting phased BAM file to Sniffles. I am honestly not sure if this will work, as it requires WhatsHap to accept the phased SNVs from dipcall and that the mapped BAM file corresponds (which hopefully will be the case). Nevertheless, this is how I would try it out. Hope that helps, Fritz
Dear @fritzsedlazeck, thank you for the clarification. I will try a couple of things. I basically just need to validate that the SV breakpoints exist in at least half the reads. I think the current WhatsHap strategy seems a bit circular.
Kind regards
Wait, you just want to know if the SV exists in half of the reads? Use samplot or IGV directly, without phasing.
WhatsHap is needed to tag the BAM file and then report the phasing to Sniffles. If you don't do it this way, the hap1 or hap2 assignment will likely differ from your assembly results, because hap1 and hap2 are assigned randomly. It's only within a phase block that you can rely on them.
Hi @fritzsedlazeck, that may work for a few SVs but not for thousands. In any case, I see that Sniffles picked up the variant I was looking to validate, but the breakpoints are not exactly the same even though the event is in more or less the same region (it is a large inversion). I wanted to use it to get a recall rate between the two methods (SyRI and Sniffles) as a way to independently validate SVs identified by SyRI. I hope that makes sense. I guess a BED intersect would do, or something like SURVIVOR? I would probably have to accept that Sniffles has an SV there and not consider the SV type (i.e. presence/absence of a variant rather than its type).
Thank you very much for the help.
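A minimal sketch of the presence/absence recall idea described above, in plain Python. It assumes both call sets have already been reduced to `(chrom, start, end)` tuples (for example from BED files). The function names and the `slop` tolerance of 500 bp are hypothetical choices for illustration and are not part of SyRI, Sniffles, bedtools, or SURVIVOR. SV type is deliberately ignored.

```python
from collections import defaultdict


def group_by_chrom(rows):
    """Group (chrom, start, end) tuples by chromosome, normalising start <= end."""
    by_chrom = defaultdict(list)
    for chrom, start, end in rows:
        by_chrom[chrom].append((min(start, end), max(start, end)))
    return by_chrom


def is_supported(chrom, start, end, calls, slop):
    """True if any call on the same chromosome overlaps [start - slop, end + slop]."""
    return any(
        c_start <= end + slop and c_end >= start - slop
        for c_start, c_end in calls.get(chrom, [])
    )


def recall(truth_rows, call_rows, slop=500):
    """Fraction of truth intervals with at least one overlapping call.

    `slop` (assumed value, in bp) tolerates breakpoints that do not match exactly.
    """
    if not truth_rows:
        return 0.0
    calls = group_by_chrom(call_rows)
    hits = sum(
        is_supported(chrom, min(s, e), max(s, e), calls, slop)
        for chrom, s, e in truth_rows
    )
    return hits / len(truth_rows)
```

Calling `recall(syri_rows, sniffles_rows)` would give the share of SyRI events that have any Sniffles call nearby. For thousands of events a sorted or interval-tree lookup would be faster than this linear scan.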
Hi, first I want to say this is a very nice tool.
My question is regarding calling phased variants. I currently have phased assemblies, and have called SVs with SyRI. I would like to use this read alignment method as an independent validation of the SVs detected with another tool, similar to what was done in this paper: https://www.sciencedirect.com/science/article/pii/S1674205224000820?via%3Dihub#sec3
I currently do not have a phased BAM, as we used hifiasm + Hi-C for the phased assembly. Can I assume that haplotype-specific variants will only have half the reads spanning that specific breakpoint, or would it be better to use a read-to-assembly alignment method to generate a phased BAM?
Any help would be appreciated,
Kind regards
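As a rough illustration of the "half the reads" check asked about above, the sketch below computes the fraction of variant-supporting reads from one VCF sample column. It assumes the FORMAT column carries `DR` (reference-supporting reads) and `DV` (variant-supporting reads), as in the VCFs I have seen from Sniffles; check the header of your own VCF first. The function names and the 0.3 to 0.7 window are hypothetical choices, not recommended values.

```python
def variant_read_fraction(format_field, sample_field):
    """Return DV / (DR + DV) for one sample column, or None if there is no coverage.

    Assumes the FORMAT keys include DR and DV (to be verified in the VCF header).
    """
    values = dict(zip(format_field.split(":"), sample_field.split(":")))
    ref_reads = int(values["DR"])
    var_reads = int(values["DV"])
    total = ref_reads + var_reads
    if total == 0:
        return None
    return var_reads / total


def looks_haplotype_specific(fraction, low=0.3, high=0.7):
    """True if the supporting fraction falls in an assumed window around 0.5."""
    return fraction is not None and low <= fraction <= high
```

For a line with FORMAT `GT:GQ:DR:DV` and sample `0/1:60:12:11`, the fraction is 11 out of 23 reads, which falls inside the assumed window.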