xunchen85 / ERVcaller

ERVcaller is a tool designed to accurately detect and genotype non-reference unfixed endogenous retroviruses (ERVs) and other transposable elements (TEs) in the human genome using next-generation sequencing (NGS) data. We evaluated the tool using both simulated and real benchmark whole-genome sequencing (WGS) datasets. ERVcaller can accurately detect TE insertions of any length, particularly ERVs. It allows the use of a TE reference library regardless of sequence complexity, such as the entire RepBase database. It is easy to install and run from the command line.
http://www.uvm.edu/genomics/software/ERVcaller.html

Assembly of the target ERV #30

Open 1577377232 opened 7 months ago

1577377232 commented 7 months ago

Hi Xun,

I used IGV to check the breakpoints, then used samtools to extract the reads mapped within 150 bp of the target site duplication (TSD), assembled them, and compared them.
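For anyone following along, the extraction step could be sketched roughly as below. This is only an illustration, not part of ERVcaller: the function names, the breakpoint coordinate, and the file names are all hypothetical, and the BAM file is assumed to be coordinate-sorted and indexed so that `samtools view` can fetch reads by region.

```python
def breakpoint_region(chrom, pos, flank=150):
    """Return a samtools-style region string covering pos +/- flank (1-based)."""
    start = max(1, pos - flank)  # clamp so the window never starts before base 1
    end = pos + flank
    return f"{chrom}:{start}-{end}"


def samtools_view_args(bam_path, region, out_path):
    """Build the argument list for `samtools view` on an indexed BAM."""
    # -b writes BAM output, -o names the output file
    return ["samtools", "view", "-b", "-o", out_path, bam_path, region]


if __name__ == "__main__":
    # Hypothetical breakpoint and file names, for illustration only
    region = breakpoint_region("chr1", 1_000_000)
    print(" ".join(samtools_view_args("sample.bam", region, "breakpoint_reads.bam")))
```

Note that a region query like this only returns reads aligned inside the window; mates of discordant pairs that map elsewhere would have to be pulled out separately.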

Is there a way to assemble a more complete ERV from the discordant reads or split reads? I ask because the INFOR field in the INFO column of the VCF file looks very complete. For example: INFOR=ERV3, 1,7525,7525.
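To look at that field across many records, a small helper like the following could pull it out of the INFO column. It is a minimal sketch that assumes INFO entries are separated by semicolons (as in standard VCF) and that INFOR is a comma-separated list. The function name `parse_infor` is hypothetical, and the meaning of each value is not interpreted here; that should be checked against the ERVcaller documentation.

```python
def parse_infor(info_column):
    """Return the comma-separated INFOR values as a list of strings, or None if absent."""
    for entry in info_column.split(";"):
        key, sep, value = entry.partition("=")
        if sep and key.strip() == "INFOR":
            # Strip stray spaces around each value, e.g. "ERV3, 1,7525,7525"
            return [token.strip() for token in value.split(",")]
    return None


if __name__ == "__main__":
    # Example value quoted from the VCF above
    print(parse_infor("INFOR=ERV3, 1,7525,7525"))
```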

Best, Dan

xunchen85 commented 6 months ago

Hi Dan,

Sorry, with short reads you may only be able to recover the boundary sequences if the inserted TE is long. You may consider validating it experimentally or using long-read sequencing.

Best, Xun