Alexamk / RREFinder

Bioinformatic application for the detection of RREs in protein sequences of interest
GNU Affero General Public License v3.0

Missing file in exploratory mode (Could not solve by installing HHsuite in the same conda environment) #6

Open deandz2 opened 3 years ago

deandz2 commented 3 years ago

Hi,

It seems that we have a directory issue, apparently due to a missing file in exploratory mode. This problem may have come up in a previous issue, but applying the solution suggested there (installing HHsuite in the same conda environment) did not solve it.

Any suggestion is appreciated. Thank you very much. Here is the log.

(RREfinder) ~/RREFinder$ python2 RRE.py -v2 -t fasta -c 10 -i test.fasta -m exploratory abcxyz
Warning! Output folder with name abcxyz already found - results may be overwritten
Reading in file test.fasta
Continuing with 6 queries
Skipped 0 genes
Rewriting fasta
Resubmitting 3 found RREs
hhblits -cpu 10 -d data/database/RRE_v5_iter_3 -i output/abcxyz/fastas/NonB_RRE.fasta -oa3m output/abcxyz/fastas/NonB_RRE_expalign.a3m -o output/abcxyz/fastas/NonB_RRE_expalign.hhr -v 0 -n 3
addss.pl output/abcxyz/fastas/NonB_RRE_expalign.a3m output/abcxyz/fastas/NonB_RRE_expalign_ss.a3m -a3m
Traceback (most recent call last):
  File "RRE.py", line 1298, in <module>
    res,parsed_data_dict = main(settings)
  File "RRE.py", line 1129, in main
    all_groups = rrefinder_main(settings,RRE_targets,all_groups)
  File "RRE.py", line 896, in rrefinder_main
    resubmit_all(all_groups,RRE_targets,settings)
  File "RRE.py", line 520, in resubmit_all
    resubmit_group(group,RRE_targets,settings,settings.cores)
  File "RRE.py", line 461, in resubmit_group
    add_ss(group,settings,resubmit=True)
  File "RRE.py", line 258, in add_ss
    p = Popen(cmds,stdout=PIPE,stderr=PIPE)
  File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Alexamk commented 3 years ago

Hi, it seems to be crashing when trying to run the addss.pl script, which is part of the HHSuite set of tools. Can you check whether addss.pl is found when you run it on its own in the same environment in which you run RREFinder?
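One way to run that check from Python, as a minimal sketch. The traceback ends in `OSError: [Errno 2]` raised inside `Popen`, which is typically what Python reports when the executable itself cannot be resolved on PATH. The helper name `find_missing_tools` is illustrative and not part of RREFinder, and the snippet assumes Python 3 (`shutil.which` does not exist in Python 2.7).

```python
import shutil


def find_missing_tools(tools):
    """Return the entries of `tools` that cannot be resolved on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Tool names taken from the commands shown in the log above.
    required = ["hhblits", "hhsearch", "addss.pl"]
    missing = find_missing_tools(required)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All tools found on PATH")
```

Run it inside the activated conda environment; any tool it lists would need its folder added to PATH before RRE.py can call it.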

deandz2 commented 3 years ago

Hi Alex,

Thanks for your help. I can get results now. The issue was that PATH did not point to the right hhsuite folder (build/scripts); having the hhsuite package in the conda environment did not help at all, for reasons that are unclear.

However, the log reports some errors (error 1 and error 255), I think specifically during the PSIPRED step and related to the share/psipred_4.01/data folder ("Cannot open weights file!").

What I do not really understand is how I was still able to get results (RREs predicted in both the initial submission and the resubmission), and whether these errors affect any aspect of the results.

I copied the log here. Thank you very much.

Reading in file test.fasta
Continuing with 1 queries
Skipped 0 genes
Rewriting fasta
hmmsearch --cpu 10 -o output/yolo7d/results/RREfinder_hmm_results.txt --domtblout output/yolo7d/results/RREfinder_hmm_results.tbl -T 15 data/hmm/RRE_phmms_3_iter.hmm output/yolo7d/fastas/fasta_all.fasta
Resubmitting 1 found RREs
hhblits -cpu 10 -d data/database/RRE_v5_iter_3 -i output/yolo7d/fastas/RosB_RRE.fasta -oa3m output/yolo7d/fastas/RosB_RRE_expalign.a3m -o output/yolo7d/fastas/RosB_RRE_expalign.hhr -v 0 -n 3
addss.pl output/yolo7d/fastas/RosB_RRE_expalign.a3m output/yolo7d/fastas/RosB_RRE_expalign_ss.a3m -a3m

$ cp output/yolo7d/fastas/RosB_RRE_expalign.a3m /tmp/kzxkue8ebV/zK8tbrxDlV.1.in.a3m
Filtering alignment to diversity 7 ...
$ hhfilter -v 1 -neff 7 -i /tmp/kzxkue8ebV/zK8tbrxDlV.in.a3m -o /tmp/kzxkue8ebV/zK8tbrxDlV.in.a3m
$ /home/wav/hh-suite/build/scripts/reformat.pl -v 1 -r -noss a3m psi /tmp/kzxkue8ebV/zK8tbrxDlV.in.a3m /tmp/kzxkue8ebV/zK8tbrxDlV.in.psi
Predicting secondary structure with PSIPRED ...
$ /home/wav/miniconda3/envs/RREfinder/bin/blastpgp -b 1 -j 1 -h 0.001 -d /home/wav/hh-suite/build/data/do_not_delete -i /tmp/kzxkue8ebV/zK8tbrxDlV.sq -B /tmp/kzxkue8ebV/zK8tbrxDlV.in.psi -C /tmp/kzxkue8ebV/zK8tbrxDlV.chk 1> /tmp/kzxkue8ebV/zK8tbrxDlV.blalog 2> /tmp/kzxkue8ebV/zK8tbrxDlV.blalog
$ echo zK8tbrxDlV.chk > /tmp/kzxkue8ebV/zK8tbrxDlV.pn
$ echo zK8tbrxDlV.sq > /tmp/kzxkue8ebV/zK8tbrxDlV.sn
$ /home/wav/miniconda3/envs/RREfinder/bin/makemat -P /tmp/kzxkue8ebV/zK8tbrxDlV
$ /home/wav/miniconda3/envs/RREfinder/bin/psipred /tmp/kzxkue8ebV/zK8tbrxDlV.mtx /home/wav/miniconda3/envs/RREfinder/share/psipred_4.01/weights.dat /home/wav/miniconda3/envs/RREfinder/share/psipred_4.01/weights.dat2 /home/wav/miniconda3/envs/RREfinder/share/psipred_4.01/weights.dat3 > /tmp/kzxkue8ebV/zK8tbrxDlV.ss
Cannot open weights file!

Error: command '/home/wav/miniconda3/envs/RREfinder/bin/psipred /tmp/kzxkue8ebV/zK8tbrxDlV.mtx /home/wav/miniconda3/envs/RREfinder/share/psipred_4.01/weights.dat /home/wav/miniconda3/envs/RREfinder/share/psipred_4.01/weights.dat2 /home/wav/miniconda3/envs/RREfinder/share/psipred_4.01/weights.dat3 > /tmp/kzxkue8ebV/zK8tbrxDlV.ss' returned error code 255

$ /home/wav/miniconda3/envs/RREfinder/bin/psipass2 /home/wav/miniconda3/envs/RREfinder/share/psipred_4.01/weights_p2.dat 1 0.98 1.09 /tmp/kzxkue8ebV/zK8tbrxDlV.ss2 /tmp/kzxkue8ebV/zK8tbrxDlV.ss > /tmp/kzxkue8ebV/zK8tbrxDlV.horiz
Cannot open weight file!

Error: command '/home/wav/miniconda3/envs/RREfinder/bin/psipass2 /home/wav/miniconda3/envs/RREfinder/share/psipred_4.01/weights_p2.dat 1 0.98 1.09 /tmp/kzxkue8ebV/zK8tbrxDlV.ss2 /tmp/kzxkue8ebV/zK8tbrxDlV.ss > /tmp/kzxkue8ebV/zK8tbrxDlV.horiz' returned error code 1

done

hhsearch -cpu 10 -d data/database/RRE_short -i output/yolo7d/fastas/RosB_RRE_expalign_ss.a3m -o output/yolo7d/results/RosB_RRE.hhr -v 0
Parsing results
Found 1 RRE hits
Max regs found: 0
Max regs found: 0
Finished. Total time: 16.27 seconds (on 10 cores)
RREfinder hits found: 1 out of 1
RREfinder resubmit hits found: 1 out of 1

Alexamk commented 3 years ago

That's strange. Can you check if the weight files it looks for are in the folder you specified? It looks for weights.dat, weights.dat2 and weights.dat3, by the looks of it in the folder /home/wav/miniconda3/envs/RREfinder/share/psipred_4.01. There might be a /data missing at the end.

Iirc, you might still get results, but the HHBlits scoring will not take into account secondary structure (which is what PSIPRED predicts). This actually affects the scoring a lot for these RREs.

You can check whether the secondary structure was successfully integrated by looking at the first few lines of the HHblits output (the .a3m files). The files labelled with _ss.a3m should have a line specifying the secondary structure, while the ones without that label should not.
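Both checks can be scripted, as a rough sketch rather than anything RREFinder ships. The function names are made up for illustration, the weight file names are taken from the psipred and psipass2 commands in the log, and the `>ss_pred` header is an assumption about how addss.pl labels the predicted secondary structure in an a3m file.

```python
import os

# File names taken from the psipred/psipass2 command lines in the log above.
WEIGHT_FILES = ("weights.dat", "weights.dat2", "weights.dat3", "weights_p2.dat")


def missing_weight_files(data_dir, names=WEIGHT_FILES):
    """Return the weight files from `names` that are not present in data_dir."""
    return [n for n in names if not os.path.isfile(os.path.join(data_dir, n))]


def has_secondary_structure(a3m_path, max_lines=50):
    """Look for a secondary-structure header in the first lines of an a3m file.

    Assumption: the prediction is stored as an entry whose header line
    starts with '>ss_pred'.
    """
    with open(a3m_path) as handle:
        for line_number, line in enumerate(handle):
            if line_number >= max_lines:
                break
            if line.startswith(">ss_pred"):
                return True
    return False
```

Calling `missing_weight_files` on share/psipred_4.01 and then on share/psipred_4.01/data would show which of the two folders actually holds the files, and `has_secondary_structure` can be pointed at the _expalign_ss.a3m file from the output folder.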