parklab / xTea

Comprehensive TE insertion identification with WGS/WES data from multiple sequencing technics

About the case-control analysis in Illumina Short-read WGS Data #108

Closed recepyilmaz01 closed 1 month ago

recepyilmaz01 commented 1 month ago

Hello, thanks for developing xTEA.

As a junior researcher working to improve my bioinformatics skills, I have run into an issue while using xTEA's case-control mode to identify possible somatic transposon insertions. When I checked the resulting regions in IGV, I found that in several of them the control samples show more and longer clipped reads than the case samples. What could be the reason for this? As an example, I am attaching IGV screenshots.

Would you kindly help me to find a potential solution to my problem?

Many thanks for considering my request.

[Screenshots: IGV views of the control and case samples]

simoncchu commented 1 month ago

It looks like there are allele dropouts in the case. See the two heterozygous SNPs on the right of the screenshot.

recepyilmaz01 commented 1 month ago

Thanks for the reply.

Based on our findings, the allele frequency in the tumor (case) and matched normal (control) samples was determined to be 0.545. How should we interpret an allele frequency of 0.545 in the context of potential allele dropout? Specifically, can we conclude that despite possible allele dropout in the tumor sample, there is an insertion in the region? Or should this insertion in the tumor sample only be considered a sequencing artifact?

| SVLEN | TSD | TSDLEN | SUBTYPE | STRAND | AF | INS_INV | REF_REP |
| -- | -- | -- | -- | -- | -- | -- | -- |
| 236 | +AAAACCAAGGTCAT | 14 | two_side_tprt_both | + | 0.545454545 | Not-5prime-inversion | not_in_Alu_copy |

simoncchu commented 1 month ago

It's a germline insertion, which is why you observe it in both the case and the control. What I mean is that there is an allele dropout in the tumor sample, so the VAF there is lower than in the control.
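For readers following the reasoning, here is a minimal Python sketch of the logic, not xTEA's actual AF calculation. The read counts, the `vaf` and `classify` helpers, and the `min_control_vaf` threshold are all hypothetical and chosen only for illustration. The idea is that an insertion with clear support in the matched normal is treated as germline, even when its VAF in the tumor is lower because of allele dropout.

```python
def vaf(alt_reads: int, ref_reads: int) -> float:
    """Fraction of reads covering the site that support the insertion allele."""
    total = alt_reads + ref_reads
    if total == 0:
        raise ValueError("no reads cover the site")
    return alt_reads / total


def classify(case_vaf: float, control_vaf: float, min_control_vaf: float = 0.1) -> str:
    """Rough call based only on VAF in the case and matched control.

    min_control_vaf is an assumed cutoff for "the control carries the allele";
    it is not a value taken from xTEA.
    """
    if control_vaf >= min_control_vaf:
        # Present in the matched normal, so not tumor-specific.
        return "germline"
    if case_vaf > 0:
        return "candidate somatic"
    return "absent"


# Hypothetical read counts: a heterozygous site in the control (about 0.5),
# and reduced support in the tumor, as expected with allele dropout.
control_vaf = vaf(alt_reads=6, ref_reads=5)
case_vaf = vaf(alt_reads=3, ref_reads=8)

print(f"control VAF = {control_vaf:.3f}, case VAF = {case_vaf:.3f}")
print(classify(case_vaf, control_vaf))
```

With these made-up counts the control VAF is close to the 0.545 reported in the table, which is what a heterozygous germline insertion would look like. A call like this would still need confirmation in IGV, as done in this thread.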