soedinglab / hh-suite

Remote protein homology detection suite.
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3019-7
GNU General Public License v3.0

mkdssp error in hhblits #218

Open yusufzaferaydin opened 4 years ago

yusufzaferaydin commented 4 years ago

Make sure to check out our User Guide.

Expected Behavior

The -i flag of mkdssp should be followed by the input PDB filename.

Current Behavior

The following error is produced. Note that the -i option is set to 1 instead of a PDB filename:

Error: command '/vol1/software/dssp/3.1.4/mkdssp -i 1 -o /tmp/MhEmNhP0FV/F7RKqm9AjW.dssp 2> /dev/null' returned error code 127
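A side note on reading the code in this message: POSIX shells conventionally return 127 when the command itself cannot be found, which is a different failure from a program that starts and then rejects its input. The short Python sketch below shows that convention. It assumes the reported code is the shell's exit status passed through unchanged, and the path `/nonexistent/dir/mkdssp` is a made-up placeholder.

```python
import subprocess

def shell_exit_code(command):
    """Run a command line through the system shell and return its exit status."""
    return subprocess.run(command, shell=True).returncode

# Hypothetical path that does not exist, so the shell cannot find the binary.
code = shell_exit_code("/nonexistent/dir/mkdssp -i 1 -o /dev/null 2> /dev/null")
print(code)
```

On a POSIX shell this prints 127, so it may be worth checking that the mkdssp path itself is correct before looking at the -i argument.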

Steps to Reproduce (for bugs)

The error is produced by HHblits in the latest version of HH-suite3 from GitHub.

Your Environment

The latest version of HHblits and HHpred
128 GB RAM
Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
Ubuntu 16.04.1 LTS

milot-mirdita commented 4 years ago

The last version I tried was dssp-2.0.4-linux-amd64, which worked fine.

We can't give support for HHpred currently due to the lack of time and resources, as all grants for the development of the HH-suite were rejected.

yusufzaferaydin commented 4 years ago

It is probably not due to the DSSP version but to the script in HHblits that builds the command line for DSSP. It works for some proteins but not for others. For instance, if the FASTA file is 1fdx.fasta and contains

>1fdx
AYVINDSCIACGACKPECPVNIIQGSIYAIDADSCIDCGSCASVCPVGAPNPED

HHblits produces this error:

Error: command '/vol1/apps/dssp/3.1.4/mkdssp -i 1 -o /tmp/XdIE49enjg/rLf13B_wHm.dssp 2> /dev/null' returned error code 1

I think -i in the above command line should be followed by a PDB filename instead of 1.

However, no error is produced if the FASTA file is 1aazb-1-DOMAK.fasta and contains

>1aazb-1-DOMAK
MFKVYGYDSNIHKCVYCDNAKRLLTVKKQPFEFINIMPEKGVFDDEKIAELLTKLGRDTQIGLTMPQVFAPDGSHIGGFDQLREYFK

Maybe it is related to the length of the protein ID. Thanks.

schdaude commented 4 years ago

Hi, we experienced a similar problem when running scripts/addss.pl and narrowed it down to the AppendDsspSequences function in that script. It checks whether the sequence name matches the pattern for a SCOPe, SCOP, PDB or DALI identifier. If not, the function returns without adding anything. 1aazb-1-DOMAK therefore just returns, but 1fdx matches the regular expression for PDB and the function proceeds.
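To illustrate the kind of name check described above, here is a minimal Python sketch. The regular expressions are simplified assumptions for illustration and are not copied from scripts/addss.pl (which is Perl and may use different patterns). `classify_identifier` is a hypothetical helper, and DALI identifiers are left out.

```python
import re

# Simplified, assumed patterns; the real ones in addss.pl may differ.
# SCOP/SCOPe-style domain id: 'd' + 4-character PDB code + chain char + domain char.
SCOP_RE = re.compile(r"d\d[a-z0-9]{3}[a-z0-9_.][a-z0-9_]", re.IGNORECASE)
# PDB-style id: one digit plus three alphanumerics, with an optional "_chain" suffix.
PDB_RE = re.compile(r"\d[a-z0-9]{3}(_[a-z0-9])?", re.IGNORECASE)

def classify_identifier(name):
    """Return 'scop', 'pdb', or None if the whole name matches neither pattern."""
    if SCOP_RE.fullmatch(name):
        return "scop"
    if PDB_RE.fullmatch(name):
        return "pdb"
    return None

for name in ("1fdx", "1aazb-1-DOMAK"):
    print(name, classify_identifier(name))
```

With these assumed patterns, 1fdx is classified as a PDB identifier and would go on to the DSSP step, while 1aazb-1-DOMAK matches nothing and would be skipped, which is consistent with the behaviour reported in this thread.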

Our guess is that when you run the script on a system where the paths specified in HHPaths.pm (pdbdir, dsspdir) are not present, you get the error message you described.
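If that guess is right, a guard that checks the input file before the command is assembled would turn the confusing "-i 1" call into a clear message. The sketch below is only an illustration in Python, not the logic in addss.pl. `build_dssp_command` is a hypothetical helper, and the `-i`/`-o` argument order is taken from the error messages quoted in this issue.

```python
import os

def build_dssp_command(mkdssp, pdb_path, out_path):
    """Return the mkdssp argument list, or raise if the PDB input is not a real file."""
    # Reject placeholder values such as 1 or "1" that are not existing files.
    if not isinstance(pdb_path, str) or not os.path.isfile(pdb_path):
        raise FileNotFoundError(f"PDB input not found: {pdb_path!r}")
    return [mkdssp, "-i", pdb_path, "-o", out_path]
```

A caller could catch the exception and skip the DSSP step for that sequence instead of aborting the whole run.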

We didn't follow up any further, as we only need the PSIPRED prediction and simply suppressed the DSSP part by setting dssp in HHPaths.pm to "".

Maybe this helps.