xunchen85 / ERVcaller

ERVcaller is a tool designed to accurately detect and genotype non-reference unfixed endogenous retroviruses (ERVs) and other transposable elements (TEs) in the human genome using next-generation sequencing (NGS) data. We evaluated the tool using both simulated and real benchmark whole-genome sequencing (WGS) datasets. ERVcaller can accurately detect TE insertions of any length, particularly ERVs. It allows the use of any TE reference library regardless of sequence complexity, such as the entire RepBase database. It is easy to install and run from the command line.
http://www.uvm.edu/genomics/software/ERVcaller.html

Some chromosomes don't harbour any variation #25

Open rain-zjg opened 9 months ago

rain-zjg commented 9 months ago

Hello Dr. Xun Chen,

My target plant species has 26 chromosomes and a genome size of about 2.2 Gb. The chromosomes are named chr01, chr02, chr03, ..., chr24, chr25, chr26. I have run the whole pipeline for 139 individuals, but the final VCF file has no calls for chromosomes chr01 to chr09. Are there any naming rules for chromosome IDs? And can this software be used for other species, such as plants?

All the best, Rain

xunchen85 commented 9 months ago

Hi, I am sorry about that. As described in the README, ERVcaller currently supports only human chromosome names: chr1 to chr22, chrX, chrY, and chrMT. You may need to rename chr01 to chr1, and likewise for the others.

If your plant genome uses the same chromosome naming scheme, it should work too. Otherwise, you could rename your chromosomes before running it. Let me know if you need further help. I could also revise the script for you!
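
A minimal sketch of the renaming described above, for anyone who prefers to script it. This is not part of ERVcaller: the function names `strip_leading_zeros` and `rename_fasta_headers` are hypothetical, and the sketch assumes the names to change live in the header lines of the reference FASTA. The same names would then need to be used consistently in every other input, for example by aligning the reads against the renamed reference.

```python
import re


def strip_leading_zeros(chrom: str) -> str:
    """Turn zero-padded IDs such as 'chr01' into 'chr1'.

    Names that are not 'chr' followed by zero-padded digits
    (for example 'chr10' or 'chrX') are returned unchanged.
    """
    match = re.fullmatch(r"(chr)0+(\d+)", chrom)
    return match.group(1) + match.group(2) if match else chrom


def rename_fasta_headers(lines):
    """Yield FASTA lines with the sequence ID in each header renamed.

    Only the first whitespace-delimited token after '>' is touched;
    any description after it is preserved. Sequence lines pass through.
    """
    for line in lines:
        if line.startswith(">"):
            name, sep, rest = line[1:].rstrip("\n").partition(" ")
            yield ">" + strip_leading_zeros(name) + sep + rest + "\n"
        else:
            yield line
```

Note that this only covers the zero-padded names (chr01 to chr09); it does not help with chromosome numbers above 22.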

Best, Xun

rain-zjg commented 9 months ago

Thanks for your answer. I checked my final VCF file again, and calls for chromosomes chr23 to chr26 are missing too. I can rename chr01 to chr09, but what about chr23 to chr26? Those numbers are larger than 22. Could a future version of ERVcaller handle chromosome names that are not limited to human? I have tested dozens of tools for calling TIPs, but only a few produce VCF files as output. VCF files are easy to handle in downstream population genetic analyses, so it would be very useful if ERVcaller could detect TIPs in other species that use different naming rules.
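
The check described above (finding which chromosomes have no calls in the final VCF) can be sketched as follows. This is an illustrative helper, not an ERVcaller script: the function names are hypothetical, and it relies only on the standard VCF layout, where header lines start with `#` and the first tab-separated column of each record is the chromosome.

```python
from collections import Counter


def count_calls_per_chrom(vcf_lines):
    """Count VCF records per chromosome (first tab-separated column)."""
    counts = Counter()
    for line in vcf_lines:
        # Skip meta-information and header lines, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        counts[line.split("\t", 1)[0]] += 1
    return counts


def chroms_without_calls(vcf_lines, expected):
    """Return the expected chromosome names that have no records."""
    counts = count_calls_per_chrom(vcf_lines)
    return [chrom for chrom in expected if counts[chrom] == 0]


# Small in-memory example; with a real file, pass the open file handle
# instead of this list.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\n",
    "chr10\t100\t.\tN\t<INS>\n",
    "chr11\t250\t.\tN\t<INS>\n",
]
expected_names = [f"chr{i:02d}" for i in range(9, 12)]  # chr09, chr10, chr11
missing = chroms_without_calls(example, expected_names)
```

Here `missing` lists the names from `expected_names` that never appear in the records, which makes a naming mismatch easy to spot across all 26 chromosomes.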

xunchen85 commented 9 months ago

Sure, I just updated the script "Order_by_TE_sequence.pl" to support the chromosome names you have. You don't need to change the chromosome IDs.

You could download it here and replace your previous one: https://github.com/xunchen85/ERVcaller/blob/v1.4/Scripts/Order_by_TE_sequence.pl.

Let me know if there are issues.

Xun

1577377232 commented 6 months ago

Hello Dr. Xun Chen, I would like to ask how to use Order_by_TE_sequence.pl. I study chickens, whose chromosomes are named 1 to 39, Z, W, and MT. How can I modify the chromosome IDs? Dan

xunchen85 commented 5 months ago

Hi Dan,

Sorry for the late reply. You could add the additional chromosomes to the Order_by_TE_sequence.pl script in the same way as the existing chromosomes. Let me know if you still have the problem.

P.S. I may also revise the script to support arbitrary chromosome IDs, so that it works for all species.
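
As an illustration of what supporting arbitrary chromosome IDs could look like, here is a rough sketch of ordering names without a hard-coded list. It is written in Python rather than Perl, the function `natural_key` is hypothetical, and it does not reflect how Order_by_TE_sequence.pl is implemented; it only shows one generic technique (a natural sort key that compares digit runs as numbers).

```python
import re


def natural_key(name: str):
    """Build a sort key that compares digit runs numerically.

    re.split with one capture group alternates non-digit text (even
    positions) and captured digit runs (odd positions). Every element
    is a (kind, number, text) tuple so keys are always comparable, and
    numeric parts sort before text parts.
    """
    key = []
    for index, part in enumerate(re.split(r"([0-9]+)", name)):
        if not part:
            continue
        if index % 2 == 1:
            key.append((0, int(part), ""))
        else:
            key.append((1, 0, part))
    return key


plant = sorted(["chr10", "chr2", "chrX", "chr01", "chr26"], key=natural_key)
chicken = sorted(["MT", "W", "10", "Z", "2", "39"], key=natural_key)
```

With this key, numbered chromosomes come out in numeric order whether or not they are zero-padded or exceed 22. Non-numeric names are simply ordered alphabetically after the numbered ones (MT, W, Z for the chicken example), so a species-specific convention such as Z, W, MT would still need to be specified explicitly.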

Xun