nhansen / SVanalyzer

Tools for the analysis of structural variation in genomes
http://svanalyzer.readthedocs.io/

A simple question #7

Open woodoo46 opened 4 years ago

woodoo46 commented 4 years ago

Hi there,

For testing, I made two VCF files; one file has a record like this:

chr1 108190704 DEL00000334 T \<DEL> . PASS IMPRECISE;SVTYPE=DEL;SVLEN=-3926;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chr1;END=108194630 ...

and another file has only one record like this:

chr1 108190708 MantaDEL:180899:0:1:0:0:0 A \<DEL> 999 PASS END=108194629;SVTYPE=DEL;SVLEN=-3921;

Supposedly, they should be merged, since breakpoints are very close to each other (left side 4bp, right side 1bp). But the output file still has two records. Is this expected?
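For reference, the breakpoint offsets quoted above can be checked directly from the two records. This is a minimal Python sketch and not part of SVanalyzer: the helper name `breakpoints` is hypothetical, and it assumes whitespace-separated records with an `END=` entry in the INFO column (the trailing `...` of the first record is omitted).

```python
def breakpoints(record):
    """Return (POS, END) for a whitespace-separated VCF record with END= in INFO."""
    fields = record.split()
    pos = int(fields[1])  # column 2 is POS
    # Column 8 is INFO; flag-only entries such as IMPRECISE have no "=" and are skipped.
    info = dict(
        item.split("=", 1) for item in fields[7].split(";") if "=" in item
    )
    return pos, int(info["END"])


delly = (
    "chr1 108190704 DEL00000334 T <DEL> . PASS "
    "IMPRECISE;SVTYPE=DEL;SVLEN=-3926;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chr1;END=108194630"
)
manta = (
    "chr1 108190708 MantaDEL:180899:0:1:0:0:0 A <DEL> 999 PASS "
    "END=108194629;SVTYPE=DEL;SVLEN=-3921;"
)

d_pos, d_end = breakpoints(delly)
m_pos, m_end = breakpoints(manta)

left_offset = abs(d_pos - m_pos)    # distance between the left breakpoints
right_offset = abs(d_end - m_end)   # distance between the right breakpoints
print(left_offset, right_offset)    # prints: 4 1
```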

I ran the command like this:

SVmerge --fof ${filename} --ref $HG38

Thanks!

George

woodoo46 commented 4 years ago

A little bit more information: 1) I do not have the sequence; 2) the output distance file is empty.

Thanks!

nhansen commented 4 years ago

Hi George. Thanks for reporting this. Why is "999" in your ALT field for the second file? I need to look at the code, but I think any line with invalid ALT entries will be skipped. Are there any errors reported in the ".log" file?

nhansen commented 4 years ago

As another note, SVmerge will in general consider deletions without the sequence only when you have an "END" tag in the INFO field (which I see you do for both variants), and will only consider insertions if the inserted sequence is specified either in the ALT field or as a "SEQ=" INFO field entry.
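The rule described in this comment can be sketched as a small check. This is only an illustration of the rule as stated here, not SVmerge's actual code: the function name `is_considered` is hypothetical, and deciding the variant type from the `SVTYPE` INFO tag and returning `False` for other types are assumptions made for the sketch.

```python
def is_considered(alt, info):
    """Apply the rule as stated in the comment above to one VCF record.

    alt  -- the ALT column, e.g. "<DEL>" or an explicit sequence
    info -- the raw INFO column, e.g. "END=108194629;SVTYPE=DEL"
    """
    tags = {}
    for item in info.split(";"):
        if not item:
            continue  # tolerate a trailing ";"
        key, _, value = item.partition("=")
        tags[key] = value

    # An explicit allele is made of bases only; "<DEL>" or "<INS>" is symbolic.
    alt_is_sequence = bool(alt) and set(alt.upper()) <= set("ACGTN")

    svtype = tags.get("SVTYPE")  # assumption: type is read from SVTYPE
    if svtype == "DEL":
        # Deletions without sequence need an END tag in INFO.
        return alt_is_sequence or "END" in tags
    if svtype == "INS":
        # Insertions need the inserted sequence in ALT or in SEQ=.
        return alt_is_sequence or bool(tags.get("SEQ"))
    return False  # other types are not covered by the comment (assumption)


manta_ok = is_considered("<DEL>", "END=108194629;SVTYPE=DEL;SVLEN=-3921;")
no_end = is_considered("<DEL>", "SVTYPE=DEL;SVLEN=-3921")
ins_without_seq = is_considered("<INS>", "SVTYPE=INS;SVLEN=50")
ins_with_seq = is_considered("<INS>", "SVTYPE=INS;SEQ=ACGT")
```

Under these assumptions, both deletion records quoted in this thread pass the check, since each carries an `END` tag.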

woodoo46 commented 4 years ago

Sorry, the Manta call was copied wrong. It should be like this:

chr1 108190708 MantaDEL:180899:0:1:0:0:0 A \<DEL> 999 PASS END=108194629;SVTYPE=DEL;SVLEN=-3921.....

They all have an END tag in the INFO field.

The log file is this:

2020/07/29 10:13:42 Calculating distances between neighboring variants in VCF files listed in test.fof
2020/07/29 10:13:42 Writing vcf file merge.test.fof.clustered.vcf of clustered variants
2020/07/29 10:13:42 Done

Thanks!