Dfam-consortium / RepeatModeler

De-Novo Repeat Discovery Tool

Errors with the -LTRStruct parameter #74

Open oushujun opened 4 years ago

oushujun commented 4 years ago

@jebrosen I reran the conda RepeatModeler with -debug using the following command:

/usr/bin/time -v RepeatModeler -engine ncbi -pa $threads -database $genome -LTRStruct -ninja_dir /home/oushujun/las/bin/NINJA/NINJA/ -debug

I received these errors in the LTRStruct step:

Clustering...LTRPipeline::runNinja : tmpdir = .../RM_124716.SunMar291508572020/LTR_252316.MonMar301112332020/NINJA_252316.MonMar301221092020
LTRPipeline::runNinja : Running analysis /home/oushujun/las/bin/miniconda2/envs/EDTA/bin/Ninja --in .../RM_124716.SunMar291508572020/LTR_252316.MonMar301112332020/mafft-alignment.fa --out .../RM_124716.SunMar291508572020/LTR_252316.MonMar301112332020/NINJA_252316.MonMar301221092020/cluster.dat --out_type c --corr_type m --cluster_cutoff 0.2 --threads 36 > .../RM_124716.SunMar291508572020/LTR_252316.MonMar301112332020/NINJA_252316.MonMar301221092020/Ninja.log 2>&1
LTRPipeline: Error - could not cluster MAFFT results.
: 00:00:00 (hh:mm:ss) Elapsed Time
LTRPipeline : Error - could not open .../RM_124716.SunMar291508572020/LTR_252316.MonMar301112332020/clusters.dat! at /home/oushujun/las/bin/miniconda2/envs/EDTA/share/RepeatModeler/LTRPipeline line 325.

I checked the log file NINJA_<num>.<date>/Ninja.log and it contains the following error message:

sh: /home/oushujun/las/bin/miniconda2/envs/EDTA/bin/Ninja: No such file or directory

It seems that RepeatModeler is not picking up the provided NINJA path. Could you help check?

Shujun

Originally posted by @oushujun in https://github.com/Dfam-consortium/RepeatModeler/issues/62#issuecomment-606153508

jebrosen commented 4 years ago

I misread what your issue was before - in multiple ways. Sorry about that...

I believe this is actually a bug. It looks like RepeatModeler does not pass command-line path overrides down to LTRPipeline.

As a workaround, an environment variable ought to work properly:

NINJA_DIR=/home/oushujun/las/bin/NINJA/NINJA /usr/bin/time -v RepeatModeler -engine ncbi -pa $threads -database $genome -LTRStruct -debug
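Before starting a long run with this workaround, it may help to confirm that the directory really contains a runnable Ninja. The following is only a sketch: the function name check_ninja_dir is made up for illustration and is not part of RepeatModeler, and it only checks that a file named Ninja exists with the executable bit set, not that the binary actually runs on this system.

```shell
#!/bin/sh
# Hypothetical helper (not part of RepeatModeler): verify that a directory
# contains an executable file named "Ninja".
check_ninja_dir() {
    dir=$1
    if [ ! -e "$dir/Ninja" ]; then
        echo "no Ninja found in $dir" >&2
        return 1
    fi
    if [ ! -x "$dir/Ninja" ]; then
        echo "$dir/Ninja exists but is not executable" >&2
        return 2
    fi
    echo "ok: $dir/Ninja"
}

# Possible use together with the workaround above (commented out because it
# needs a real RepeatModeler installation and the $threads/$genome variables):
# export NINJA_DIR=/home/oushujun/las/bin/NINJA/NINJA
# check_ninja_dir "$NINJA_DIR" &&
#     RepeatModeler -engine ncbi -pa $threads -database $genome -LTRStruct -debug
```

A binary can pass this check and still fail at run time, for example when it was built for a different system, so a failed check is informative but a passed one is not a guarantee.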

oushujun commented 4 years ago

This seems like an independent issue, so I opened this new thread. After the conda RepeatMasker error I encountered and described above, I installed the full version (v2.0.1) of RepeatMasker with the configure script. I re-ran the full pipeline and still encountered the same Ninja error:

Clustering...LTRPipeline::runNinja : tmpdir = .../RepeatModeler/RM_163533.MonMar301322402020/LTR_53177.TueMar311844022020/NINJA_53177.TueMar311953202020
LTRPipeline::runNinja : Running analysis /home/oushujun/las/bin/NINJA/NINJA//Ninja --in .../RepeatModeler/RM_163533.MonMar301322402020/LTR_53177.TueMar311844022020/mafft-alignment.fa --out .../RepeatModeler/RM_163533.MonMar301322402020/LTR_53177.TueMar311844022020/NINJA_53177.TueMar311953202020/cluster.dat --out_type c --corr_type m --cluster_cutoff 0.2 --threads 36 > .../RepeatModeler/RM_163533.MonMar301322402020/LTR_53177.TueMar311844022020/NINJA_53177.TueMar311953202020/Ninja.log 2>&1
LTRPipeline: Error - could not cluster MAFFT results.
: 00:00:00 (hh:mm:ss) Elapsed Time

Except this time the Ninja error log shows:

sh: /home/oushujun/las/bin/NINJA/NINJA//Ninja: cannot execute binary file

There is a double "/" in the command, but I assume this does not affect execution. I then checked the precompiled Ninja program, and it could not be executed on my Linux system...
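The assumption about the double slash holds on POSIX systems, where repeated slashes in the middle of a path are treated as a single slash. A minimal demonstration, using a throwaway temporary directory and a placeholder file rather than the real Ninja binary:

```shell
#!/bin/sh
# Show that "dir//file" and "dir/file" refer to the same file.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/Ninja"   # placeholder file, not a real binary

single=$(cat "$tmp/Ninja")
double=$(cat "$tmp//Ninja")

if [ "$single" = "$double" ]; then
    echo "same file"
fi
rm -rf "$tmp"
```

So the "cannot execute binary file" message points at the binary itself rather than at the path. If the file utility is installed, running it on the Ninja binary reports which platform the binary was built for.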

oushujun commented 4 years ago

So I think you are right, the -ninja_dir parameter is not passed down to the LTRPipeline.

jebrosen commented 4 years ago

Then I check the precompiled Ninja program, it could not be executed in my Linux system...

Where did you get a precompiled Ninja program? Those might have been accidentally committed to the repository at one point but I thought those had been cleaned up by now.

oushujun commented 4 years ago

It was from the NINJA repository, but I think I didn't use the NINJA 0.95-cluster_only version. I will try that.

oushujun commented 4 years ago

There are executables called Ninja_new and Ninja_old in the NINJA 0.95-cluster_only package, both reported errors like these:

./Ninja_new: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by ./Ninja_new)
./Ninja_new: /lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by ./Ninja_new)
./Ninja_new: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by ./Ninja_new)

I could recompile it with make all, but that only produced a program named Ninja, which seems to be executable. Should I use this for RepeatModeler?
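Errors of this kind mean the binary was linked against a newer libstdc++ than the one installed on the system. As a rough sketch, the missing symbol versions can be pulled out of the error text with standard tools. The function name missing_versions is invented for this example, and grep -o is assumed to be available (it is in GNU, BSD and BusyBox grep, though it is not required by POSIX).

```shell
#!/bin/sh
# Hypothetical helper: read loader error text on stdin and print each
# missing GLIBCXX/CXXABI symbol version once, in sorted order.
missing_versions() {
    grep -oE '(GLIBCXX|CXXABI)_[0-9]+(\.[0-9]+)*' | LC_ALL=C sort -u
}

# Possible use (commented out because it needs the precompiled binary):
# ./Ninja_new 2>&1 | missing_versions
#
# With binutils installed, the versions the system library does provide can
# be listed for comparison:
# strings /lib64/libstdc++.so.6 | grep GLIBCXX
```

Rebuilding with make avoids the mismatch because the resulting binary is linked against the libstdc++ that is actually present on the build machine.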

jebrosen commented 4 years ago

Yes - Ninja built by make is the correct one. Sorry about that, I really thought those were cleaned up but there are more leftover binaries in there than I realized.

arborhys commented 3 years ago

I had this same issue. A more permanent workaround for me was creating a symlink to Ninja in the directory where the pipeline expected it.
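A minimal sketch of the symlink workaround, assuming a Ninja binary that was built locally with make. The function name link_ninja and both arguments are placeholders for illustration; the directory the pipeline expects is the one shown in the "Running analysis" line of the error output.

```shell
#!/bin/sh
# Hypothetical helper: link a locally built Ninja into the directory where
# the pipeline looks for it.
link_ninja() {
    built=$1      # path to the Ninja binary produced by make
    expected=$2   # directory the pipeline searches for Ninja
    if [ ! -x "$built" ]; then
        echo "$built is not executable" >&2
        return 1
    fi
    # ln -s fails if "$expected/Ninja" already exists, which avoids
    # silently replacing an existing file.
    ln -s "$built" "$expected/Ninja"
}

# Possible use (paths are placeholders):
# link_ninja /path/to/NINJA/Ninja /path/to/conda/env/bin
```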

lczqd commented 3 years ago

I got the same issue.

It seems that Ninja did not produce a clusters.dat file as RepeatModeler expected, but it did produce something:

-rw-rw-r-- 1 lcz lcz 6612177 Aug 14 17:23 LtrRetriever-redundant-results.fa
-rw-rw-r-- 1 lcz lcz 275719074 Aug 14 18:17 mafft-alignment.fa
-rw-rw-r-- 1 lcz lcz 2847322 Aug 14 16:22 raw-struct-results.txt

So I guess it is something about the version of Ninja? The Ninja I used was the precompiled NINJA-0.95-cluster_only/Ninja_new, which I soft-linked as Ninja.

BioFalcon commented 3 years ago

@lczqd Those are not outputs of Ninja. raw-struct-results.txt is a product of LTRHarvest, LtrRetriever-redundant-results.fa of LTR_Retriever, and mafft-alignment.fa of MAFFT. What does your log say?
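The log in question is the Ninja.log written into the NINJA_<num>.<date> directory, as seen in the error output earlier in this thread. A small sketch for locating and printing every such log under a RepeatModeler working directory; the function name show_ninja_logs is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print every Ninja.log found inside a NINJA_* directory
# below the given RepeatModeler working directory.
show_ninja_logs() {
    find "$1" -type f -name Ninja.log -path '*NINJA_*' |
    while IFS= read -r log; do
        echo "== $log"
        cat "$log"
    done
}

# Possible use, run from the directory that contains the RM_* folder:
# show_ninja_logs .
```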

caonetto commented 3 years ago

I'm having the same issue; the pipeline is failing at:

Clustering...LTRPipeline::runNinja : tmpdir = /mnt/e70c0819-464b-4569-bbb6-cbc208b1daa1/cris/work/eutypa_lata/repeatmodeler/RM_1378244.ThuAug191026162021/LTR_1415401.FriAug200351142021/NINJA_1415401.FriAug200354262021
LTRPipeline::runNinja : Running analysis /mnt/e70c0819-464b-4569-bbb6-cbc208b1daa1/cris/miniconda3/envs/repeatmodeler/bin/Ninja --in /mnt/e70c0819-464b-4569-bbb6-cbc208b1daa1/cris/work/eutypa_lata/repeatmodeler/RM_1378244.ThuAug191026162021/LTR_1415401.FriAug200351142021/mafft-alignment.fa --out /mnt/e70c0819-464b-4569-bbb6-cbc208b1daa1/cris/work/eutypa_lata/repeatmodeler/RM_1378244.ThuAug191026162021/LTR_1415401.FriAug200351142021/NINJA_1415401.FriAug200354262021/cluster.dat --out_type c --corr_type m --cluster_cutoff 0.2 --threads 70 > /mnt/e70c0819-464b-4569-bbb6-cbc208b1daa1/cris/work/eutypa_lata/repeatmodeler/RM_1378244.ThuAug191026162021/LTR_1415401.FriAug200351142021/NINJA_1415401.FriAug200354262021/Ninja.log 2>&1
LTRPipeline: Error - could not cluster MAFFT results.
: 00:00:00 (hh:mm:ss) Elapsed Time
LTRPipeline : Error - could not open /mnt/e70c0819-464b-4569-bbb6-cbc208b1daa1/cris/work/eutypa_lata/repeatmodeler/RM_1378244.ThuAug191026162021/LTR_1415401.FriAug200351142021/clusters.dat! at /mnt/e70c0819-464b-4569-bbb6-cbc208b1daa1/cris/miniconda3/envs/repeatmodeler/share/RepeatModeler/LTRPipeline line 325.

Any ideas? The Ninja file seems to be working ok.

jebrosen commented 3 years ago

The ninjia I used was the precompiled NINJA-0.95-cluster_only/Ninja_new which I made a soft link as Ninjia.

@lczqd Please see the previous comments about this exact problem: https://github.com/Dfam-consortium/RepeatModeler/issues/74#issuecomment-608022714. The precompiled programs have never been updated and were only included by accident; in the latest release https://github.com/TravisWheelerLab/NINJA/releases/tag/0.98-cluster_only these have been removed to avoid confusion in the future.

lczqd commented 3 years ago

Thanks. That is what I did: I compiled Ninja from source and the problem was solved.

lczqd commented 3 years ago

@caonetto Compiling Ninja from source will solve the problem. The precompiled Ninja is outdated.

deoliveira86 commented 1 year ago

Hello,

I am facing the same issue. I compiled Ninja from source, but I still get a segmentation fault:

NINJA-0.95-cluster_only/NINJA/Ninja --in mafft-alignment.fa --out cluster.dat --out_type c --corr_type m --cluster_cutoff 0.2 --cluster_cutoff 0.2
Reading...
Segmentation fault

Any ideas on how to solve this issue?

Best,
André

rmhubley commented 1 year ago

@deoliveira86 Could you share the mafft-alignment.fa file that is causing this failure?
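Alongside sharing the file, a quick summary of it can rule out some generic input problems. The sketch below counts the sequences in a FASTA alignment and reports how many sequences have each length; in a multiple alignment such as MAFFT output, all rows should have the same length. This is a general sanity check, not a known cause of the segmentation fault, and the function name summarize_alignment is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: summarize a FASTA alignment by sequence count and by
# the number of sequences at each length (gap characters are counted).
summarize_alignment() {
    awk '
        /^>/ { if (n) lens[len]++; n++; len = 0; next }   # new record
        { gsub(/[ \t\r]/, ""); len += length($0) }        # sequence line
        END {
            if (n) lens[len]++
            printf "sequences: %d\n", n
            for (l in lens) printf "length %d: %d sequence(s)\n", l, lens[l]
        }
    ' "$1"
}

# Possible use:
# summarize_alignment mafft-alignment.fa
```

More than one "length" line, or a sequence count of zero, would suggest that the alignment file is truncated or malformed.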