parklab / xTea

Comprehensive TE insertion identification with WGS/WES data from multiple sequencing technologies

Issue with Somatic Insertion Genotyping in Case-Control Mode #118

Closed by MuratcanMentes 4 months ago

MuratcanMentes commented 5 months ago

Hello,

First, I want to express my gratitude for developing xTEA.

I am encountering an issue while using xTEA’s case-control mode to identify somatic transposon (specifically Alu) insertions. Based on my understanding from other issues, a somatic insertion should have a genotype of 1/1 or 0/1 in the tumor and 0/0 in the matched control. However, in my case-control analysis, every genotype in the resulting VCF file is “./.”.

Could you help me understand why this might be happening? Additionally, are there alternative methods to determine somatic insertions in the resulting VCF file?

Many thanks for considering my request.

simoncchu commented 5 months ago

Hi, case-ctrl mode is used to identify somatic TE insertions. We don't expect genotype information for somatic insertions. We classify germline insertions as heterozygous or homozygous to indicate that they are inherited from one or both parents. Somatic insertions, however, form post-zygotically and thus should be absent from both parents.
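
To see what this means for a result file in practice, here is a minimal Python sketch (not part of xTEA) that tallies the GT value of one sample column in VCF text. The function name `genotype_counts` is hypothetical, and the two example records are made up for illustration; they are not real xTEA output. If every call in a case-ctrl VCF is `./.`, the tally contains only that key, which is consistent with genotypes simply not being assigned to somatic calls.

```python
from collections import Counter


def genotype_counts(vcf_lines, sample_index=0):
    """Tally GT values for one sample column in an iterable of VCF text lines."""
    counts = Counter()
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        column = 9 + sample_index  # sample columns start after FORMAT (column 9)
        if len(fields) <= column:
            continue  # record has no such sample column
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue
        sample_values = fields[column].split(":")
        counts[sample_values[format_keys.index("GT")]] += 1
    return counts


# Made-up records for illustration only (assumed layout, not xTEA output).
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ttumor",
    "chr1\t12345\t.\tN\t<INS>\t.\tPASS\t.\tGT\t./.",
    "chr2\t67890\t.\tN\t<INS>\t.\tPASS\t.\tGT\t./.",
]

print(dict(genotype_counts(example_vcf)))
```

A tally like this only describes what the file contains. It says nothing about whether a call is truly somatic, so read-level evidence in the control still needs to be checked separately.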

MuratcanMentes commented 5 months ago

Thank you for your quick response and clarification.

The IGV screenshots of some regions have caused some confusion. Despite xTEA predicting these regions as somatic insertions, I observed clipped reads and discordant reads in the matched control. Could you help me understand the reason for this discrepancy? Did I misunderstand xTEA's results? Does xTEA identify the regions with confirmed insertions or potential somatic insertions?

Thank you for your time and assistance.

[IGV screenshots: Control_1 and Tumor]

simoncchu commented 5 months ago

If this is the case, then it's a false positive. I'm just not sure why it was reported, since the signal in the control is strong.
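
One way to screen for this kind of false positive after the fact is to count soft-clipped reads in the control near each reported site. The sketch below is a hedged illustration, not xTEA's own filter: the function name `clipped_reads_near`, the 50 bp window, and the 10 bp minimum clip length are all assumptions chosen for the example. It works on SAM-format text lines (for instance, the text that `samtools view` prints for a BAM region), and the example reads are invented.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clipped_reads_near(sam_lines, chrom, pos, window=50, min_clip=10):
    """Count reads with a soft clip whose breakpoint lies within `window` bp of `pos`."""
    count = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue
        rname, start, cigar = fields[2], int(fields[3]), fields[5]
        if rname != chrom or cigar == "*":
            continue
        ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
        if not ops:
            continue
        # Reference bases consumed by the alignment give the last aligned position.
        ref_len = sum(n for n, op in ops if op in "MDN=X")
        end = start + ref_len - 1
        left_clip = ops[0][0] if ops[0][1] == "S" else 0
        right_clip = ops[-1][0] if ops[-1][1] == "S" else 0
        # A left clip breaks at the alignment start, a right clip at its end.
        if left_clip >= min_clip and abs(start - pos) <= window:
            count += 1
        elif right_clip >= min_clip and abs(end - pos) <= window:
            count += 1
    return count


def sam_line(name, chrom, pos, cigar):
    """Build a minimal SAM record; only the fields used above carry real values."""
    return "\t".join([name, "0", chrom, str(pos), "60", cigar, "*", "0", "0", "*", "*"])


# Invented control reads around a hypothetical candidate site at chr1:1005.
control_reads = [
    sam_line("r1", "chr1", 1000, "30S70M"),  # left clip at 1000
    sam_line("r2", "chr1", 920, "80M20S"),   # right clip at 999
    sam_line("r3", "chr1", 950, "100M"),     # no clipping
    sam_line("r4", "chr2", 1000, "30S70M"),  # other chromosome
    sam_line("r5", "chr1", 5000, "30S70M"),  # too far from the site
]

print(clipped_reads_near(control_reads, "chr1", 1005))
```

A nonzero count in the control would mark the site for manual review in IGV. Discordant read pairs, which were also visible in the screenshots, are not covered by this sketch.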