ERVcaller is a tool designed to accurately detect and genotype non-reference unfixed endogenous retroviruses (ERVs) and other transposable elements (TEs) in the human genome using next-generation sequencing (NGS) data. We evaluated the tool using both simulated and real benchmark whole-genome sequencing (WGS) datasets. ERVcaller can accurately detect TE insertions of any length, particularly ERVs. It accepts a TE reference library of any sequence complexity, such as the entire RepBase database. It is easy to install and runs from the command line.
I used IGV to check the breakpoints, then used samtools to extract the reads mapped within 150 bp of the TSD, assembled them, and compared the results.
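For anyone following along, the extraction step described above could be sketched roughly as below. This is only an illustration, not part of ERVcaller: the helper names (`flank_region`, `samtools_view_command`), the file names, and the example coordinates are all hypothetical, and the BAM file is assumed to be coordinate-sorted and indexed. The script only builds and prints the `samtools view` command; it does not run it.

```python
import shlex


def flank_region(chrom, tsd_start, tsd_end, flank=150):
    """Build a samtools-style region string (1-based, inclusive)
    covering the TSD plus `flank` bp on each side."""
    start = max(1, tsd_start - flank)  # clamp so the region never starts before position 1
    end = tsd_end + flank
    return f"{chrom}:{start}-{end}"


def samtools_view_command(bam_path, region, out_path):
    """Return the argument list for `samtools view -b -o OUT IN REGION`,
    which writes the reads overlapping REGION to a BAM file."""
    return ["samtools", "view", "-b", "-o", out_path, bam_path, region]


# Hypothetical TSD coordinates and file names, for illustration only.
region = flank_region("chr1", 1000100, 1000115)
cmd = samtools_view_command("sample.bam", region, "breakpoint_reads.bam")
print(shlex.join(cmd))
```

Note that a region query like this returns only the reads aligned inside the window. Mates of discordant pairs that map elsewhere are not included and would have to be fetched separately, for example by read name.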
Is there a way to assemble a more complete ERV from the discordant reads or split reads? I ask because the INFOR field in the INFO column of the VCF file looks very complete. For example, INFOR=ERV3,1,7525,7525.
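To look at that field across many records, a small parser along these lines may help. It is a sketch under a stated assumption: judging only from the example above, the first four comma-separated values are read as TE name, start, end, and length. That layout is a guess and should be checked against the header of the VCF file. The function name `parse_infor` and the dictionary keys are hypothetical.

```python
def parse_infor(info_column):
    """Extract the INFOR entry from a VCF INFO string.

    ASSUMPTION: the first four values are TE name, start, end, length.
    Anything after them is kept unparsed under "extra".
    Returns None when the INFO string has no INFOR entry.
    """
    for entry in info_column.split(";"):
        key, sep, value = entry.partition("=")
        if key.strip() == "INFOR" and sep:
            # Strip whitespace so "ERV3, 1,7525,7525" parses as well
            parts = [p.strip() for p in value.split(",")]
            return {
                "te_name": parts[0],
                "te_start": int(parts[1]),
                "te_end": int(parts[2]),
                "te_length": int(parts[3]),
                "extra": parts[4:],
            }
    return None


record = parse_infor("INFOR=ERV3, 1,7525,7525")
print(record)
```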
Sorry, with short reads you may only be able to recover the boundary sequences if the inserted TE is long. You may consider validating it experimentally or with long-read sequencing.