Pfam-A.hmm.h3i -- not downloaded

infinity01 commented 3 years ago

Hello!

I am having issues running the "transdecoder_hmmer" step.

Do you know how to regenerate the HMMER DB files?

It appears they exist, but the .h3i file is empty:

  -rw-r--rw- 1 root root 1459135873 Mar 8 17:44 Pfam-A.hmm
  -rw-rw-r-- 1 root root 57253888 Mar 9 12:11 Pfam-A.hmm.h3f
  -rw-rw-r-- 1 root root 0 Mar 9 12:11 Pfam-A.hmm.h3i
  -rw-rw-r-- 1 root root 105185280 Mar 9 12:11 Pfam-A.hmm.h3m
  -rw-rw-r-- 1 root root 123682816 Mar 9 12:11 Pfam-A.hmm.h3p

Thank you in advance!

Something went wrong. Check error message below and/or log files.
Error executing process > 'transdecoder_hmmer (Cprol)'

Caused by:
  Process `transdecoder_hmmer (Cprol)` terminated with an error exit status (1)

Command executed:

  echo -e "\n-- Starting HMMER --\n"

  hmmscan --cpu 8 --domtblout Cprol.pfam.domtblout /cm/shared/apps/TransPi/DBs/hmmerdb/Pfam-A.hmm Cprol.longest_orfs.pep

  echo -e "\n-- Done with HMMER --\n"

Command exit status:
  1

Command output:

  -- Starting HMMER --

Command error:

  Error: File format problem, trying to open HMM file /cm/shared/apps/TransPi/DBs/hmmerdb/Pfam-A.hmm.
  Opened /cm/shared/apps/TransPi/DBs/hmmerdb/Pfam-A.hmm.h3m, a pressed HMM file; but format of its .h3i file unrecognized

Work dir:
  /TransPi_files/work/11/5e810cef8a3a0929cdd99789408a73

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

rivera10 commented 3 years ago

Hello @infinity01,

Looks like the hmmer_db process did not build the HMMER database correctly. What was the output of that process? I guess it reported success, since the pipeline continued to the other steps and did not stop.

There are two ways to solve this, depending on whether you are using conda or containers.

If using the conda TransPi environment:

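Go to the DBs directory (i.e. `/cm/shared/apps/TransPi/DBs/hmmerdb/`) and run `rm Pfam-A.hmm.* && hmmpress Pfam-A.hmm`. This removes the old index files (the `.h3f`, `.h3i`, `.h3m`, and `.h3p` files, not `Pfam-A.hmm` itself) and rebuilds them. For this to work, the environment must be activated (`conda activate TransPi`).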
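To confirm that the rebuild worked before resuming, it can help to check that all four index files exist and are non-empty. The following is a minimal sketch; `check_pressed` is a hypothetical helper name, not something shipped with TransPi or HMMER, and it only tests file sizes, not file contents.

```shell
# Sketch: report any hmmpress index file that is missing or empty.
# check_pressed is a hypothetical helper, not part of TransPi or HMMER.
check_pressed() {
    local hmm="$1" ext status=0
    for ext in h3f h3i h3m h3p; do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "${hmm}.${ext}" ]; then
            echo "missing or empty: ${hmm}.${ext}"
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_pressed Pfam-A.hmm || echo "rerun hmmpress"` run inside the hmmerdb directory would have flagged the zero-byte `.h3i` file reported above.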

If using containers:

Remove the work directory for the hmmer_db process. This makes Nextflow rerun hmmer_db on the next resumed run. To do this, go to the directory where you ran TransPi; it should contain the results and work directories (unless you used other names when running the pipeline). Once there, use the following:
```
rm -rfi  /TransPi_files/work/$(cat /TransPi_files/results/pipeline_info/transpi_trace.txt | grep "hmmer_db" | awk '{print $2}')*/
```
NOTE: The line above assumes your results and work directories are in /TransPi_files.
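Since the command above deletes directories, it may be worth listing what it would match first. This is a rough sketch under the same assumption as the command above, namely that the shortened task hash is the second column of the trace file; `list_workdirs` is a hypothetical helper name.

```shell
# Sketch: list (do not delete) the work directories recorded for a process.
# list_workdirs is a hypothetical helper; it assumes the task hash is the
# second column of the trace file, as the rm command above does.
list_workdirs() {
    local trace="$1" workdir="$2" process="$3" hash dir
    grep -- "$process" "$trace" | awk '{print $2}' | while read -r hash; do
        # The trace stores a shortened hash (e.g. 11/5e810c), so expand it with a glob.
        for dir in "$workdir/$hash"*/; do
            if [ -d "$dir" ]; then echo "$dir"; fi
        done
    done
}
```

Something like `list_workdirs /TransPi_files/results/pipeline_info/transpi_trace.txt /TransPi_files/work hmmer_db` should print only the hmmer_db work directory; if it prints anything else, adjust the pattern before running `rm`.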

After doing one of the above, resume the pipeline by adding -resume when calling TransPi.

Let me know how it goes.

Best, Ramon

rivera10 commented 3 years ago

I tried to replicate the issue by running the tool multiple times, and the HMMER database was created with no problem each time. However, for a future release I will have the transdecoder_hmmer process check that the DB was created successfully.

infinity01 commented 3 years ago

Thank you! That seemed to fix it for the HMMER DB, but now it's giving me an error for diamond:

Error executing process > 'transdecoder_diamond (Cprol)'

Caused by:
  Process `transdecoder_diamond (Cprol)` terminated with an error exit status (1)

Command executed:

  unidb=/cm/shared/apps/TransPi//DBs/diamonddb_custom/uniprot_metazoa_33208.fasta

  echo -e "\n-- Starting Diamond (blastp) --\n"

  diamond blastp -d $unidb -q Cprol.longest_orfs.pep -p 8 -f 6 -k 1 -e 0.00001 >Cprol.diamond_blastp.outfmt6

  echo -e "\n-- Done with Diamond (blastp) --\n"

Command exit status:
  1

Command output:

  -- Starting Diamond (blastp) --

Command error:
  diamond v0.9.30.131 (C) Max Planck Society for the Advancement of Science
  Documentation, support and updates available at http://www.diamondsearch.org

  Error: Incomplete database file. Database building did not complete successfully.

Work dir:
  /TransPi_files/work/c2/c857d2f9ce780b7139e8e2a2d4fa63

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

rivera10 commented 3 years ago

Hello,

It seems the same happened to the diamond DB. Very odd. I will add a check for this process too. For now you can do the same as for the hmmer DB.

Either:

Go to the DBs directory (i.e. /cm/shared/apps/TransPi//DBs/diamonddb_custom/) and run the following line:
```
rm *.dmnd && diamond makedb --in uniprot_metazoa_33208.fasta -d uniprot_metazoa_33208.fasta
```
For this to work, the environment must be activated (`conda activate TransPi`). This assumes that you are using the uniprot_metazoa proteins. If not, change the file names accordingly.
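As an alternative to removing the old file first, one option is to build under a temporary name and move the result into place only if `diamond makedb` exits successfully, so an interrupted build does not leave a partial file under the final name. This is only a sketch, not what the pipeline does: `rebuild_diamond_db` and the `.building` suffix are hypothetical, and it assumes diamond appends `.dmnd` to the name given with `-d`, as the file names above suggest.

```shell
# Sketch: rebuild a diamond database without exposing a half-written file.
# rebuild_diamond_db and the ".building" suffix are hypothetical names.
# Assumes "diamond makedb -d NAME" writes NAME.dmnd.
rebuild_diamond_db() {
    local fasta="$1" tmp="${1}.building"
    rm -f "${tmp}.dmnd"                      # clear leftovers from an earlier failed attempt
    if diamond makedb --in "$fasta" -d "$tmp"; then
        mv "${tmp}.dmnd" "${fasta}.dmnd"     # replace the database only after a clean exit
    else
        rm -f "${tmp}.dmnd"
        echo "diamond makedb failed; existing database left untouched" >&2
        return 1
    fi
}
```

From the diamonddb_custom directory, this would be called as `rebuild_diamond_db uniprot_metazoa_33208.fasta`.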

Or remove the process work directory:

`rm -rfi /TransPi_files/work/$(cat /TransPi_files/results/pipeline_info/transpi_trace.txt | grep "custom_diamond_db" | awk '{print $2}')*/`

Best, Ramon

rivera10 commented 3 years ago

I fixed this on the dev branch. See f7ff9e5. Let me know if you need further help.

PalMuc / TransPi
