fritzsedlazeck / SURVIVOR

Toolset for SV simulation, comparison and filtering
MIT License

Different results with different order in the vcflist #170

Open Yuma248 opened 2 years ago

Yuma248 commented 2 years ago

Hi there

I am using PINDEL, MANTA, and GRIDSS to identify SVs in 120 low-coverage WGs. I tried to apply similar filters to all callers and then merged the sorted VCFs using SURVIVOR (SURVIVOR merge vcflist 1000 2 1 1 0 30 PMG.vcf). At first it looked like it worked well. However, when I change the order in the vcflist (from MANTA, GRIDSS, PINDEL to PINDEL, GRIDSS, MANTA or GRIDSS, PINDEL, MANTA), I get very different results, not just in the number of SVs but also in the SV types.

ORD | Tot | DEL | DUP | INV | INS | TRA
--- | --- | --- | --- | --- | --- | ---
MGP | 40,684 | 30,620 | 678 | 7,437 | 1,845 | 104
PGM | 49,785 | 30,908 | 664 | 9,311 | 7,534 | 1,368
GPM | 42,592 | 31,248 | 678 | 1,857 | 7,442 | 1,367
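To make the size of the discrepancy easier to see, here is a minimal sketch that takes the numbers from the table above, checks that each row's per-type counts add up to its total, and reports how far each SV type moves between the three orders. The dictionary layout and function names are illustrative only; nothing here calls SURVIVOR.

```python
# Counts copied from the table above; keys are the vcflist orders
# (M = MANTA, G = GRIDSS, P = PINDEL).
counts = {
    "MGP": {"Tot": 40684, "DEL": 30620, "DUP": 678, "INV": 7437, "INS": 1845, "TRA": 104},
    "PGM": {"Tot": 49785, "DEL": 30908, "DUP": 664, "INV": 9311, "INS": 7534, "TRA": 1368},
    "GPM": {"Tot": 42592, "DEL": 31248, "DUP": 678, "INV": 1857, "INS": 7442, "TRA": 1367},
}
SV_TYPES = ["DEL", "DUP", "INV", "INS", "TRA"]


def type_sum(row):
    """Sum the per-type counts of one row (excluding the Tot column)."""
    return sum(row[sv_type] for sv_type in SV_TYPES)


def spread_by_type(table):
    """For each SV type, return max minus min of its count across all orders."""
    spreads = {}
    for sv_type in SV_TYPES:
        values = [row[sv_type] for row in table.values()]
        spreads[sv_type] = max(values) - min(values)
    return spreads


for order, row in counts.items():
    print(order, "per-type sum matches Tot:", type_sum(row) == row["Tot"])

for sv_type, spread in spread_by_type(counts).items():
    print(sv_type, "max - min across orders:", spread)
```

In these numbers DEL and DUP stay close across orders, while INV, INS, and TRA account for most of the difference.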

Wouldn't you expect similar results regardless of the order of the vcflist? Or does SURVIVOR give some weight to the first input VCF?
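For anyone who wants to reproduce this systematically, the following is a rough sketch that builds one vcflist and one merge command for every possible order of the three callers. The per-caller file names (manta.vcf, gridss.vcf, pindel.vcf), the list file names, and the output names are hypothetical placeholders; the positional arguments are copied unchanged from the command above. The script only prints the commands and does not run SURVIVOR.

```python
from itertools import permutations

# Hypothetical per-caller VCF paths; substitute the real filtered, sorted files.
CALLER_VCFS = {"M": "manta.vcf", "G": "gridss.vcf", "P": "pindel.vcf"}

# Positional arguments copied verbatim from the command in the post above.
MERGE_ARGS = ["1000", "2", "1", "1", "0", "30"]


def build_runs(caller_vcfs):
    """Return (label, list_path, list_text, command) for every caller order."""
    runs = []
    for order in permutations(caller_vcfs):
        label = "".join(order)  # e.g. "MGP"
        # One VCF path per line, in the order being tested.
        list_text = "\n".join(caller_vcfs[caller] for caller in order) + "\n"
        list_path = f"vcflist_{label}.txt"  # hypothetical list file name
        command = ["SURVIVOR", "merge", list_path, *MERGE_ARGS, f"{label}.vcf"]
        runs.append((label, list_path, list_text, command))
    return runs


for label, list_path, list_text, command in build_runs(CALLER_VCFS):
    print(f"# {list_path}:", list_text.split())
    print(" ".join(command))
```

Writing each `list_text` to its `list_path` and running the printed commands would give one merged VCF per order, which can then be compared side by side.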

changhan1110 commented 1 year ago

Hi,

I have encountered the same issue, and I cannot find any information about it in the GitHub wiki or the README. For your information, I am merging results from Delly, Manta, GRIDSS, and Lumpy.

Thanks

dr-ashu-geno commented 1 year ago

Hi @Yuma248 @changhan1110, the same happened with my data!

Did you guys find any way to solve this problem? I cannot really understand the reason..

@fritzsedlazeck any help on this please? Thank you..

Yuma248 commented 1 year ago

Hi @dr-ashu-geno

Unfortunately, we did not find any solution, and even with further comparisons, we did not find a pattern in the merging results. It would be nice to hear from @fritzsedlazeck...
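For others who want to run the same kind of comparison, here is a minimal sketch for tallying SV types in two merged VCFs and listing where they differ. It assumes each record carries an `SVTYPE=` entry in the INFO column (the eighth tab-separated field); records without one are counted as UNKNOWN. The two toy inputs below are made-up records for illustration, not real SURVIVOR output.

```python
from collections import Counter


def count_sv_types(vcf_lines):
    """Count records per SVTYPE from an iterable of VCF text lines."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        sv_type = "UNKNOWN"
        for entry in fields[7].split(";"):  # INFO column
            if entry.startswith("SVTYPE="):
                sv_type = entry[len("SVTYPE="):]
                break
        counts[sv_type] += 1
    return counts


def diff_counts(first, second):
    """Return {type: second - first} for every type whose count differs."""
    all_types = sorted(set(first) | set(second))
    return {t: second.get(t, 0) - first.get(t, 0)
            for t in all_types if first.get(t, 0) != second.get(t, 0)}


# Toy records (made up) standing in for two merged outputs.
toy_a = [
    "##fileformat=VCFv4.1",
    "chr1\t100\tsv1\tN\t<DEL>\t.\tPASS\tSVLEN=-50;SVTYPE=DEL",
    "chr1\t900\tsv2\tN\t<INV>\t.\tPASS\tSVTYPE=INV;END=1500",
]
toy_b = [
    "##fileformat=VCFv4.1",
    "chr1\t100\tsv1\tN\t<DEL>\t.\tPASS\tSVLEN=-50;SVTYPE=DEL",
    "chr1\t900\tsv2\tN\t<INS>\t.\tPASS\tSVTYPE=INS;END=900",
]

print(diff_counts(count_sv_types(toy_a), count_sv_types(toy_b)))
```

With real files, an open file handle can be passed straight to `count_sv_types`, since it only iterates over lines.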