Closed RyanVidegar-Laird closed 1 year ago
These are insertions, which means the inserted sequences are absent from the reference genome. Here, SVLEN
is the insertion length, not a span on the reference. In general, we expect one insertion to have one breakpoint on the genome; however, for TE insertions the two strands usually do not break at exactly the same location (you can search for target-site duplication in L1 retrotransposition
to understand more). Thus, two breakpoints are reported here (POS and END), and END is not expected to equal POS + SVLEN. Hope this helps.
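To make the distinction concrete, here is a minimal sketch using plain awk on a few invented records (the coordinates and lengths below are hypothetical, not taken from any real xTea output). It assumes the columns are CHROM, POS, END, SVLEN and shows that END - POS is just the gap between the two reported breakpoints, while SVLEN is the length of the inserted sequence:

```shell
# Hypothetical records: CHROM POS END SVLEN (values invented for illustration).
records='chr1 10000 10015 281
chr2 50000 50012 6021'

# END - POS is the distance between the two reported breakpoints;
# SVLEN is the length of the inserted element, so the two are unrelated.
summary=$(printf '%s\n' "$records" |
  awk '{ print $1, "breakpoint_gap=" ($3 - $2), "inserted_len=" $4 }')

printf '%s\n' "$summary"
```

In this sketch the breakpoint gap is a handful of bases while the inserted length is hundreds or thousands of bases, which is the pattern the explanation above would lead one to expect for TE insertions.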
Hi, thanks for all of your work on this tool!
I've run xTea (with defaults) on 4 short-read WGS samples, and am a bit confused why the `POS`, `END`, and `SVLEN` values don't seem to align in the VCF output. I would expect `END = POS + SVLEN`, yet this doesn't hold in any of my samples for Alu or L1 SVs. Is this an error? I'm new to working with SVs, so perhaps it's my misunderstanding.
Small output example:
awk '!/orphan/' ./xtea/out/sample-01_ALU.vcf | bcftools query -f'[%CHROM\t%POS\t%INFO/SVLEN\t%END\n]' - | shuf -n 5 | awk '{$5 = $4-$2}1' | column -t