parklab / xTea

Comprehensive TE insertion identification with WGS/WES data from multiple sequencing technologies

insertion sequences #36

Closed dr-ashu-geno closed 2 years ago

dr-ashu-geno commented 2 years ago

Hi.

Thank you for developing xTea. I ran xTea on all my samples successfully. Now I have a question: it seems that xTea does not report the insertion sequences. Is that right? If so, is there any way to extract the sequence of each ME?

Thank you in advance, Best,

simoncchu commented 2 years ago

The reads for each insertion are collected under the tmp/cns folder, in files named temp_disc.sam and temp_clip.sam (if the insertion is longer than the insert size, these are only the reads from the two ends of the insertion). You can split out the reads of each insertion and do a local assembly. I'll expose an option for this in the next release.
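A minimal sketch of the "split out the reads of each insertion" step, in plain Python. It assumes the two SAM files can be read as ordinary SAM text and that some field of each record identifies the insertion it belongs to. How xTea actually labels reads is not stated in this thread, so the `key_from_qname` helper and its `~chrom~pos` read-name suffix are hypothetical; replace it with whatever matches your files. Each group is written as FASTA text that can be passed to a local assembler of your choice.

```python
from collections import defaultdict


def group_sam_reads(sam_lines, key_of):
    """Group SAM records by insertion; returns {key: [(read_name, sequence), ...]}."""
    groups = defaultdict(list)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record (11 mandatory columns)
        seq = fields[9]  # column 10 holds the read sequence
        if seq == "*":
            continue  # sequence not stored for this record
        groups[key_of(fields)].append((fields[0], seq))
    return dict(groups)


def key_from_qname(fields, sep="~"):
    """HYPOTHETICAL key: assumes read names end with '<sep>chrom<sep>pos'.

    This naming scheme is an assumption for illustration only; inspect
    temp_disc.sam / temp_clip.sam and adapt this function to the real layout.
    """
    parts = fields[0].split(sep)
    if len(parts) < 3:
        raise ValueError(f"read name has no insertion suffix: {fields[0]}")
    return (parts[-2], parts[-1])


def to_fasta(reads):
    """Render [(name, seq), ...] as FASTA text for a local assembler."""
    return "".join(f">{name}\n{seq}\n" for name, seq in reads)


if __name__ == "__main__":
    # Tiny synthetic example; real input would be open("tmp/cns/temp_clip.sam").
    demo = [
        "@HD\tVN:1.6\n",
        "r1~chr1~1000\t0\tL1\t5\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
        "r2~chr2~500\t0\tL1\t1\t60\t4M\t*\t0\t0\tTTAA\tIIII\n",
    ]
    for key, reads in group_sam_reads(demo, key_from_qname).items():
        print(key, len(reads))
```

Passing the key function as a parameter keeps the grouping logic independent of the labeling scheme, which is the part most likely to differ from this sketch.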

dr-ashu-geno commented 2 years ago

Thank you so much for your reply.

One more short question: have you ever tried merging the ME insertion sites across all the samples and then genotyping each sample against the merged list of sites? If so, what programs would you recommend for this?

Thank you in advance, Best regards,

simoncchu commented 2 years ago

We don't have a joint genotyping module right now; we only have the gVCF-level genotype. When I merged the calls with an in-house script, I set the genotype to "0/0" for any insertion that was not present in a sample. I was told that tools like SURVIVOR can do the VCF merging, although I haven't tried it myself.
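The merging approach described above can be sketched roughly as follows. This is not the in-house script, only an illustration under stated assumptions: per-sample calls are already parsed into `(chrom, pos, genotype)` tuples, and calls whose positions lie within a hypothetical `window` of each other are treated as the same site. Samples without a call at a merged site get "0/0".

```python
def merge_and_fill(calls_by_sample, window=50):
    """Merge insertion sites across samples and fill missing genotypes.

    calls_by_sample: {sample_name: [(chrom, pos, genotype), ...]}
    window: assumed maximum gap (bp) between neighbouring calls that are
            merged into one site; the value 50 is an arbitrary placeholder.
    Returns a list of ((chrom, first_pos), {sample_name: genotype}).
    """
    # All distinct call positions, sorted by chromosome and then position.
    sites = sorted(
        {(c, p) for calls in calls_by_sample.values() for c, p, _ in calls}
    )

    clusters = []          # each entry: [chrom, first_pos, last_pos]
    site_to_cluster = {}   # (chrom, pos) -> index into clusters
    for chrom, pos in sites:
        last = clusters[-1] if clusters else None
        if last and last[0] == chrom and pos - last[2] <= window:
            last[2] = pos  # extend the current cluster
        else:
            clusters.append([chrom, pos, pos])
        site_to_cluster[(chrom, pos)] = len(clusters) - 1

    samples = sorted(calls_by_sample)
    # Start every sample at "0/0", then overwrite where a call exists.
    table = [{s: "0/0" for s in samples} for _ in clusters]
    for sample, calls in calls_by_sample.items():
        for chrom, pos, gt in calls:
            # If a sample has several calls in one cluster, the last one wins.
            table[site_to_cluster[(chrom, pos)]][sample] = gt

    return [((c[0], c[1]), row) for c, row in zip(clusters, table)]


if __name__ == "__main__":
    calls = {
        "A": [("chr1", 100, "0/1")],
        "B": [("chr1", 120, "1/1"), ("chr2", 50, "0/1")],
    }
    for site, genotypes in merge_and_fill(calls):
        print(site, genotypes)
```

Note that filling with "0/0" does not distinguish a true homozygous-reference site from one where the sample simply had no call, for example because of low coverage.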