Closed charliechen912ilovbash closed 3 years ago
Hi,
yes, SVIM separates translocations into two lines in order to follow the specification of the Variant Calling Format (see Section 5.4 on page 17).
If you need only one line per translocation, you could use a simple Python script similar to the one below.
Cheers, David
import sys
from cyvcf2 import VCF, Writer

vcf_file = VCF(sys.argv[1])
out_file = Writer(sys.argv[2], vcf_file)

for variant in vcf_file:
    if variant.INFO["SVTYPE"] == "BND":
        from_chrom = variant.CHROM
        from_pos = variant.POS
        alt_string = variant.ALT[0]
        # fwd direction at pos1
        if alt_string[0] == "N":
            pos_fields = alt_string[2:-1].split(":")
            assert len(pos_fields) == 2
            to_chrom = pos_fields[0]
            to_pos = int(pos_fields[1])
        # rev direction at pos1
        else:
            pos_fields = alt_string[1:-2].split(":")
            assert len(pos_fields) == 2
            to_chrom = pos_fields[0]
            to_pos = int(pos_fields[1])
        if from_chrom < to_chrom:
            out_file.write_record(variant)
        elif from_chrom == to_chrom:
            if int(from_pos) < int(to_pos):
                out_file.write_record(variant)

vcf_file.close()
out_file.close()
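For anyone who wants to check the selection logic without installing cyvcf2, here is a minimal standalone sketch of the same idea using only the standard library. The helper names `parse_bnd_alt` and `keep_record` are hypothetical and not part of SVIM or cyvcf2. Unlike the slicing in the script above, the regular expression assumes nothing about the length of the base string next to the brackets, and it should also tolerate contig names that contain a colon.

```python
import re

# Breakend ALT forms described in the VCF specification:
#   t[p[   t]p]   ]p]t   [p[t      (p is chrom:pos of the mate)
BND_ALT = re.compile(
    r"^(?P<lead>[^\[\]]*)"      # bases before the bracket (may be empty)
    r"(?P<bracket>[\[\]])"      # opening bracket
    r"(?P<chrom>.+):(?P<pos>\d+)"
    r"(?P=bracket)"             # the same bracket closes the mate position
    r"(?P<trail>[^\[\]]*)$"     # bases after the bracket (may be empty)
)


def parse_bnd_alt(alt):
    """Return (mate_chrom, mate_pos) for a breakend ALT string."""
    match = BND_ALT.match(alt)
    if match is None:
        raise ValueError(f"not a breakend ALT: {alt!r}")
    return match.group("chrom"), int(match.group("pos"))


def keep_record(chrom, pos, alt):
    """True for the mate that sorts first by (chromosome name, position)."""
    return (chrom, pos) < parse_bnd_alt(alt)


if __name__ == "__main__":
    # Two mates of one hypothetical translocation: only the first is kept.
    print(keep_record("chr1", 100, "N[chr2:200["))
    print(keep_record("chr2", 200, "]chr1:100]N"))
```

As in the script above, chromosome names are compared as plain strings, so the mate that is kept is chosen consistently but not necessarily in karyotypic order (for example, "chr10" sorts before "chr2").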
Thank you very much! I will try it.
In the output, I found that translocations (BND) are separated into two lines: position1 and position2 are reversed, and the svim.BND.xxx SV ID is also different. I want to merge every translocation event into one line. Is there a recommended method? I am using multiple SV callers and found that some callers, like manta and svaba, do the same thing for translocations, while others, like sniffles and cuteSV, do not (they output only one line per translocation).