mkirsche / Jasmine

Jasmine: SV Merging Across Samples
MIT License

Merge across tools instead of merging across samples #16

Closed clairemerot closed 3 years ago

clairemerot commented 3 years ago

Hello, I really like the process and accuracy of Jasmine. I see in the manual and doc that this was designed to merge SVs across samples.

We would like (and we are trying) to use it to merge SVs detected on the same sample but with different tools. Would you say that a different set of parameters should be used in such a case? What about merging SVs detected for different samples and different tools? Has anyone tried that? What do you think?

It seems to work for some merging (e.g. SVs detected by different tools for long reads) but not so much for other merging (SVs detected from long reads and SVs detected from short reads).

Thanks for your help! Claire

mkirsche commented 3 years ago

Hi Claire,

Thanks for your interest in Jasmine! We have had success using the same (default) parameters for merging across tools, as well as across sequencing technologies.

For merging across both tools and samples at the same time, I would recommend merging all of the VCFs together at once with Jasmine, and then in the downstream processing, using the SUPP_VEC INFO field to determine which samples the variant is present in. Depending on which tools you're using, how confident you are in them, and how much the tools agree with each other, you could filter out SVs that aren't supported by multiple (or all) tools in at least one of the samples they are in.
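A minimal sketch of that downstream filtering, assuming SUPP_VEC holds one 0/1 character per input VCF in the order the files were given to Jasmine. The sample and tool names, the `inputs` list, and the helper functions are hypothetical and only illustrate the idea; they are not part of Jasmine.

```python
from typing import Dict, List, Set, Tuple


def parse_info(info: str) -> Dict[str, str]:
    """Split a VCF INFO column into a key -> value dict (flags map to '')."""
    fields: Dict[str, str] = {}
    for item in info.split(";"):
        if item:
            key, _, value = item.partition("=")
            fields[key] = value
    return fields


def tools_per_sample(info: str, inputs: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each sample to the set of tools whose input VCF supports this variant.

    `inputs` lists (sample, tool) for every input VCF, in the same order as the
    files were passed to the merge (an assumption of this sketch).
    """
    supp_vec = parse_info(info).get("SUPP_VEC", "")
    if len(supp_vec) != len(inputs):
        raise ValueError("SUPP_VEC length does not match the number of inputs")
    support: Dict[str, Set[str]] = {}
    for bit, (sample, tool) in zip(supp_vec, inputs):
        if bit == "1":
            support.setdefault(sample, set()).add(tool)
    return support


def keep_variant(info: str, inputs: List[Tuple[str, str]], min_tools: int = 2) -> bool:
    """Keep a variant if at least `min_tools` tools agree in at least one sample."""
    return any(len(tools) >= min_tools
               for tools in tools_per_sample(info, inputs).values())


# Hypothetical layout: two samples, each called with two tools.
inputs = [("sampleA", "toolX"), ("sampleA", "toolY"),
          ("sampleB", "toolX"), ("sampleB", "toolY")]

print(keep_variant("SVTYPE=DEL;SUPP_VEC=1100", inputs))  # both tools agree in sampleA
print(keep_variant("SVTYPE=DEL;SUPP_VEC=1010", inputs))  # only one tool per sample
```

Raising `min_tools` to the number of tools per sample would require all tools to agree, which is the stricter option mentioned above.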

Jasmine can merge short-read calls with each other as well as with long-read calls, but depending on which INFO fields are present in the short-read calls you may have to reformat the file or use additional parameters. One of the main differences I've seen is that short-read callers tend not to include the STRANDS INFO field (indicating breakpoint directionality), so their calls would not be able to merge with Sniffles calls, which do have that field. To circumvent this, you can use the --ignore_strand parameter in Jasmine.
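One way to check in advance whether a call set lacks the STRANDS field is to count the records whose INFO column has no STRANDS key. The sketch below assumes a plain-text, tab-separated VCF already read into lines; the function name and the example records are hypothetical.

```python
from typing import Iterable


def records_missing_strands(vcf_lines: Iterable[str]) -> int:
    """Count VCF records whose INFO column (8th field) has no STRANDS key."""
    missing = 0
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 8:
            continue  # not a complete VCF record
        keys = {item.partition("=")[0] for item in columns[7].split(";")}
        if "STRANDS" not in keys:
            missing += 1
    return missing


# Hypothetical records: one with STRANDS, one without.
header = ["##fileformat=VCFv4.2", "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"]
with_strands = header + ["chr1\t100\tsv1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=500;STRANDS=+-"]
without_strands = header + ["chr1\t100\tsv1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=500"]

print(records_missing_strands(with_strands))
print(records_missing_strands(without_strands))
```

A non-zero count for one of the inputs suggests that --ignore_strand is worth trying for that merge.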

I hope that helps, and please let me know if you have any further questions! Melanie

brentp commented 3 years ago

Hi Melanie, One issue with merging across tools is that Jasmine does not output all of the needed INFO/FORMAT fields to the merged VCF. It seems to use the header from the first VCF but doesn't output additional fields from later VCFs. Any way that could be supported? Thanks