xunchen85 / ERVcaller

ERVcaller is a tool designed to accurately detect and genotype non-reference unfixed endogenous retroviruses (ERVs) and other transposable elements (TEs) in the human genome using next-generation sequencing (NGS) data. We evaluated the tool using both simulated and real benchmark whole-genome sequencing (WGS) datasets. ERVcaller can accurately detect various TE insertions of any length, particularly ERVs. It allows the use of a TE reference library regardless of sequence complexity, such as the entire RepBase database. It is easy to install and use from the command line.
http://www.uvm.edu/genomics/software/ERVcaller.html

Problem in Step 2 #17

Open ceromova opened 2 years ago

ceromova commented 2 years ago

Hello,

I ran ERVcaller and ran into this problem:

Step 2: Detecting TE insertions...


~~~~~ the input bam file was indexed
sh: extractSoftclipped: command not found

What should I do?
Thank you.

xunchen85 commented 2 years ago

Hi, you may need to first compile the SE-MEI tool under the Scripts folder (tar vxzf SE-MEI-master.tar.gz && cd SE-MEI && make) and then use the "export" command to add it to your $PATH.

Thanks, Xun
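
The suggestion above comes down to making extractSoftclipped visible on $PATH once SE-MEI has been built. A minimal sketch of the PATH part, assuming a POSIX shell; the helper names add_to_path and have_tool and the example directory are hypothetical and not part of ERVcaller:

```shell
# Hypothetical helper: prepend a directory to PATH unless it is already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                          # already present, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Hypothetical helper: succeed only if the named tool can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Example usage (the directory is an assumption; use wherever SE-MEI was built):
# add_to_path "/path/to/ERVcaller/Scripts/SE-MEI"
# have_tool extractSoftclipped || echo "extractSoftclipped is still not on PATH" >&2
```

Running the have_tool check before starting ERVcaller would probably surface the "command not found" problem up front rather than in Step 2.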

ceromova commented 2 years ago

OK, solved. Thank you. Another problem showed up after that: [E::bwa_idx_load_from_disk] fail to locate the index files

I aligned the sequences with BWA, and the index files are: .fna, .amb, .ann, .bwt, .fai, .pac, .sa

What could be the problem here?

xunchen85 commented 2 years ago

It could be due to an empty output. ERVcaller performs a reciprocal alignment; if the output of one alignment is empty, the next alignment will fail and show this error.

Could you share the file sizes of your input, the reference (especially the chromosome IDs), and the generated outputs? Then I could help double-check.

Thanks, Xun
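
The checks requested above can be sketched as two small shell helpers, assuming a POSIX shell and awk. BWA looks for the five files <prefix>.amb, .ann, .bwt, .pac and .sa, where the prefix is normally the full FASTA file name passed to "bwa index". The function names check_bwa_index and fasta_ids are hypothetical and not part of ERVcaller or BWA:

```shell
# Hypothetical helper: report BWA index files that are missing or empty.
# $1 is the index prefix, usually the FASTA path that was given to "bwa index".
check_bwa_index() {
    prefix=$1
    status=0
    for ext in amb ann bwt pac sa; do
        f="$prefix.$ext"
        if [ ! -s "$f" ]; then                # -s: exists and is non-empty
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    return $status
}

# Hypothetical helper: print the sequence IDs (first word of each header line)
# of a FASTA file, e.g. to compare chromosome names between references.
fasta_ids() {
    awk '/^>/ { id = $1; sub(/^>/, "", id); print id }' "$1"
}
```

An empty or missing file reported by check_bwa_index on an intermediate reference would be consistent with the empty-output explanation, though it does not prove it.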

ceromova commented 2 years ago

Hi,

About the error [E::bwa_idx_load_from_disk] fail to locate the index files

I downloaded the ref from here: GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz

The file sizes of my input are 40 GB.

The generated outputs are an empty VCF file and a temp folder with the following information:

[image: contents of the temp folder]

xunchen85 commented 2 years ago

It looks like the error occurred before the reciprocal alignment step. The human reference file is okay.

Could you share the full log up to the error for debugging? It also depends on the ERV reference you are using; for example, it may be that no TE insertions matching your ERV reference were detected.

Have you successfully run the test data yet? That would confirm that ERVcaller is installed correctly.

Xun
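
One way to produce the log excerpt requested above is to save the run's output to a file and cut it at the first error line. A minimal sketch, assuming a POSIX shell and awk; the function name log_until_error is hypothetical, and the two patterns are simply the error forms quoted earlier in this thread:

```shell
# Hypothetical helper: print a log file up to and including the first line
# that contains "[E::" or "command not found", then stop.
log_until_error() {
    awk '{ print } /\[E::|command not found/ { exit }' "$1"
}

# Example usage (run.log is an assumed file name holding the saved output):
# log_until_error run.log
```

If no line matches either pattern, the whole log is printed unchanged.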