soedinglab / hh-suite

Remote protein homology detection suite.
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3019-7
GNU General Public License v3.0

Update links to databases in README #75

Closed: croth1 closed this issue 5 years ago

croth1 commented 7 years ago

From https://github.com/soedinglab/hh-suite/commit/c3567c8b8270ae89584f43d038b2bb40946002d7#commitcomment-25435936

Can we please get an updated README?

The current link to download pdb70 doesn't work.

When accessing http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/ there are no pdb70.a3m.tar.gz, pdb70.hhm.tar.gz, or pdb70*.a3m.tar.gz files.

Thanks in advance!

cc: @ppflrs

kad-ecoli commented 6 years ago

I have the same question. To run hhpred.pl we additionally need the pdb files, which I cannot find within the new pdb70_from_mmcif database files. On the other hand, the old pdb70 database files seem to have stopped updating after last April.

croth1 commented 6 years ago

ping @milot-mirdita

erlog commented 6 years ago

As well, it would be nice if there were symlinks to the latest UniClust data to make it easier to write scripts to automatically update the local copies of the database. This exists for pdb70 but not for uniclust* or uniboost.
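Until such symlinks exist, an update script could pick the newest release itself by parsing the date out of each file name. The following is only a minimal sketch: the `<name>_<YYYY>_<MM>` naming scheme, the example file names, and the helper `latest_release` are assumptions made for illustration, not something the server guarantees.

```python
import re

# Assumed naming scheme: <name>_<YYYY>_<MM>..., for example
# "uniclust30_2018_08_hhsuite.tar.gz". Adjust if the server uses another pattern.
RELEASE_RE = re.compile(r"^(?P<name>[A-Za-z0-9]+)_(?P<year>\d{4})_(?P<month>\d{2})")


def latest_release(filenames, name):
    """Return the newest file for database `name`, or None if nothing matches."""
    dated = []
    for filename in filenames:
        match = RELEASE_RE.match(filename)
        if match and match.group("name") == name:
            release = (int(match.group("year")), int(match.group("month")))
            dated.append((release, filename))
    if not dated:
        return None
    # Tuples compare by (year, month) first, so max() yields the newest release.
    return max(dated)[1]
```

The list of file names would have to come from the server's directory listing; how to fetch and parse that listing is left out here because its format is not described in this thread.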

milot-mirdita commented 5 years ago

We have updated the database paths in the wiki.

HHpred is currently unsupported and will remain so for the foreseeable future, since our request for funding for the HH-suite was denied. Sorry about that.

Uniclust will get better download links with the next release. See https://github.com/soedinglab/uniclust-pipeline/issues/9

gnmcsbnfrmtcsclb commented 5 years ago

From which links can pdb70.a3m.tar.gz and pdb70.hhm.tar.gz be downloaded? Thanks!

At https://github.com/soedinglab/hh-suite/wiki you have provided scripts to generate the databases.

However, I am interested in using a pre-made PDB70 database that ALSO contains a3m and hhm files.

Can you please provide link(s) so I can download and use them? It's OK if they are a little older....

Please note that posts here and elsewhere (e.g. https://www.biostars.org/p/377767/) have been requesting this info. So if you can help, it will help not just me, but a lot of other users as well. Thanks again!

martin-steinegger commented 5 years ago

The a3m and hhm are in the pdb70_from_mmcif_latest.tar.gz file on our web server http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/
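Fetching and unpacking that archive could be scripted roughly as follows. This is a sketch, not an official procedure: the URL and archive name are the ones given above, while the helper names `download` and `unpack` are made up for illustration. The archive is assumed to be trusted, since `extractall` writes whatever paths it contains.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL and file name as quoted in this thread.
BASE_URL = "http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/"
ARCHIVE = "pdb70_from_mmcif_latest.tar.gz"


def download(dest_dir):
    """Fetch the archive into dest_dir and return its local path."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / ARCHIVE
    urllib.request.urlretrieve(BASE_URL + ARCHIVE, str(dest))
    return dest


def unpack(archive_path, dest_dir):
    """Extract a .tar.gz archive and return the sorted names of its regular files."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        members = [m for m in tar.getmembers() if m.isfile()]
        tar.extractall(dest_dir, members=members)
    return sorted(m.name for m in members)
```

A call such as `unpack(download("downloads"), "PDB70")` would then leave the database files in a `PDB70` directory; both directory names are placeholders.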

gnmcsbnfrmtcsclb commented 5 years ago

Thanks! After the gunzip and tar steps, the directory listing reads as follows:

md5sum
pdb70_a3m.ffdata
pdb70_a3m.ffindex
pdb70_hhm.ffdata
pdb70_hhm.ffindex
pdb70_cs219.ffdata
pdb70_cs219.ffindex
pdb70.cs219
pdb70.cs219.sizes
pdb70_a3m_db.index
pdb70_hhm_db.index
pdb70_a3m_db
pdb70_hhm_db
pdb_filter.dat
pdb70_clu.tsv

Do I now have all the files for running HHPRED in the PDB70 folder? It doesn't look like it, because regardless of how many times I've downloaded that gzip file, I always get this final error line with hhpred.pl - 07:55:02.514 ERROR: Could find neither hhm_db nor a3m_db!
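One way to rule out an incomplete download is to check the folder for the files that make up a database. The sketch below assumes the HH-suite 3 layout, in which a database named `pdb70` consists of the six `_a3m`, `_hhm` and `_cs219` `.ffdata`/`.ffindex` files seen in the listing above; the helper `missing_db_files` is hypothetical.

```python
from pathlib import Path

# Suffixes assumed to make up one HH-suite 3 database <basename>.
REQUIRED_SUFFIXES = (
    "_a3m.ffdata", "_a3m.ffindex",
    "_hhm.ffdata", "_hhm.ffindex",
    "_cs219.ffdata", "_cs219.ffindex",
)


def missing_db_files(db_dir, basename="pdb70"):
    """Return the expected database file names that are absent from db_dir."""
    db_dir = Path(db_dir)
    return [
        basename + suffix
        for suffix in REQUIRED_SUFFIXES
        if not (db_dir / (basename + suffix)).is_file()
    ]
```

For the listing above, `missing_db_files("PDB70")` would be expected to return an empty list, which suggests that the error comes from the path passed to the search tool rather than from missing files.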

Complete STDOUT for my hhpred attempt is shown below.

With this as context, could you please revise your answer accordingly? Like I said before, so many researchers wanting to use the hhpred pipeline would benefit from your clarification, not just me :)

I have seen a link here that says hhpred was / is not funded and therefore not supported. But all I am requesting is info on where the hhm_db and a3m_db are in the expected hhpred database :)

Thanks a lot!

hhpred.pl -i sp.Q9ZR12.GRH1_ARATH.a3m -o sp.Q9ZR12.GRH1_ARATH.hhpred -d PDB70/

==========================================================
|                HHPRED structure predictor              |
==========================================================

mkdir -p /tmp/26591
mkdir -p /tmp/26591/sp.Q9ZR12.GRH1_ARATH
---------------------------------------------------------------------------------------------------------------------
HHpred configuration parameters:
---------------------------------------------------------------------------------------------------------------------
MDNWeightsLayer1CACA       => /share/apps/hhsuite-3.2.0//scripts/hhpred/share/neural-net/MDNWeightsLayer1CACAminP.dat
MDNWeightsLayer1NO         => /share/apps/hhsuite-3.2.0//scripts/hhpred/share/neural-net/MDNWeightsLayer1NOminP.dat
MDNWeightsLayer1SCMC       => /share/apps/hhsuite-3.2.0//scripts/hhpred/share/neural-net/MDNWeightsLayer1SCMCminP.dat
MDNWeightsLayer1SCSC       => /share/apps/hhsuite-3.2.0//scripts/hhpred/share/neural-net/MDNWeightsLayer1SCSCminP.dat
MDNWeightsLayer2CACA       => /share/apps/hhsuite-3.2.0//scripts/hhpred/share/neural-net/MDNWeightsLayer2CACAminP.dat
MDNWeightsLayer2NO         => /share/apps/hhsuite-3.2.0//scripts/hhpred/share/neural-net/MDNWeightsLayer2NOminP.dat
MDNWeightsLayer2SCMC       => /share/apps/hhsuite-3.2.0//scripts/hhpred/share/neural-net/MDNWeightsLayer2SCMCminP.dat
MDNWeightsLayer2SCSC       => /share/apps/hhsuite-3.2.0//scripts/hhpred/share/neural-net/MDNWeightsLayer2SCSCminP.dat
TMalign                    => /share/apps/hhsuite-3.2.0//scripts/hhpred/bin/TMalign
TMscore                    => /share/apps/hhsuite-3.2.0//scripts/hhpred/bin/TMscore
addss                      => /share/apps/hhsuite-3.2.0//scripts/addss.pl
assessModel                => 1
cpus                       => 4
doFiltering                => 1
doParallelModeller         => 0
hhalign                    => /share/apps/hhsuite-3.2.0//bin/hhalign
hhblits                    => /share/apps/hhsuite-3.2.0//bin/hhblits
hhblits_mact               => 0.5
hhblits_rounds             => 3
hhfilter                   => /share/apps/hhsuite-3.2.0//bin/hhfilter
hhlib                      => /share/apps/hhsuite-3.2.0/
hhmake                     => /share/apps/hhsuite-3.2.0//bin/hhmake
hhmakemodel                => /share/apps/hhsuite-3.2.0//scripts/hhpred/dependencies/hhmakemodel.pl
hhsearch                   => /share/apps/hhsuite-3.2.0//bin/hhsearch
hhsearch_mact              => 0.05
maxNumOfTemplates          => 8
modeller                   => /share/apps/hhsuite-3.2.0//scripts/hhpred/bin/modeller9.13/bin/modpy.sh python2.7
modellerParallel           => /share/apps/hhsuite-3.2.0//scripts/hhpred/bin/modeller9.13/bin/modpy.sh python2.7 
multiTemplate              => 1
multithread                => /share/apps/hhsuite-3.2.0//scripts/multithread.pl
numberOfGeneratedModels    => 3
parallelFiltering          => 0
pdbdir                     => /share/apps/hhsuite-3.2.0/databases
preselectTemplates         => 1
rankTemplates              => 1
realignProbcons            => 0
repairPDB                  => /share/apps/hhsuite-3.2.0//scripts/hhpred/bin/repair_pdb.pl
replaceDistanceRestraints  => 1
templateWeightStrategy     => 1
uniprot20                  => /share/apps/hhsuite-3.2.0/databases
---------------------------------------------------------------------------------------------------------------------

cp sp.Q9ZR12.GRH1_ARATH.a3m /tmp/26591/sp.Q9ZR12.GRH1_ARATH/sp.Q9ZR12.GRH1_ARATHjMJgHgK.a3m
/share/apps/hhsuite-3.2.0//bin/hhmake -i /tmp/26591/sp.Q9ZR12.GRH1_ARATH/sp.Q9ZR12.GRH1_ARATHjMJgHgK.a3m -o /tmp/26591/sp.Q9ZR12.GRH1_ARATH/sp.Q9ZR12.GRH1_ARATHjMJgHgK.hhm
- 07:55:01.118 INFO: /tmp/26591/sp.Q9ZR12.GRH1_ARATH/sp.Q9ZR12.GRH1_ARATHjMJgHgK.a3m is in A2M, A3M or FASTA format

- 07:55:01.470 WARNING: MSA sp.Q9ZR12.GRH1_ARATH looks too diverse (Neff=11.8515>11). Better check it with an alignment viewer for non-homologous segments. Also consider building the MSA with hhblits using the - option to limit MSA diversity.

HHLIB=/share/apps/hhsuite-3.2.0/ /share/apps/hhsuite-3.2.0//bin/hhsearch -i /tmp/26591/sp.Q9ZR12.GRH1_ARATH/sp.Q9ZR12.GRH1_ARATHjMJgHgK.hhm -d /share/apps/hhsuite-3.2.0/databases/db/pdb.hhm -mact 0.05 -cpu 4 -atab /tmp/26591/sp.Q9ZR12.GRH1_ARATH/sp.Q9ZR12.GRH1_ARATHjMJgHgK.start.tab
- 07:55:02.513 INFO: Search results will be written to /tmp/26591/sp.Q9ZR12.GRH1_ARATH/sp.Q9ZR12.GRH1_ARATHjMJgHgK.hhr

- 07:55:02.514 ERROR: Could find neither hhm_db nor a3m_db!
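Note that in the log above hhsearch is called with `-d /share/apps/hhsuite-3.2.0/databases/db/pdb.hhm`, not with a path under `PDB70/`. Assuming the HH-suite 3 convention that `-d` takes the database basename (directory plus `pdb70`, with no file extension), a direct hhsearch call could be assembled as in the sketch below. The helper `hhsearch_command` and the file names `query.hhm` and `result.hhr` are placeholders, and this does not configure hhpred.pl itself.

```python
import shlex
from pathlib import Path


def hhsearch_command(query_hhm, db_dir, basename="pdb70", cpus=4, out="result.hhr"):
    """Build an hhsearch argument list that points -d at the database basename."""
    # Assumption: -d expects <db_dir>/<basename>, so that the tool can locate
    # <basename>_hhm.ffdata, <basename>_a3m.ffdata and so on next to it.
    database = Path(db_dir) / basename
    return [
        "hhsearch",
        "-i", str(query_hhm),
        "-d", str(database),
        "-o", str(out),
        "-cpu", str(cpus),
    ]


if __name__ == "__main__":
    # Print the command for inspection instead of running it.
    print(shlex.join(hhsearch_command("query.hhm", "PDB70")))
```

The returned list can be passed to `subprocess.run` once the printed command looks right for the local installation.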

martin-steinegger commented 5 years ago

I do not know how to set up HHpred. There is a working HHpred instance at the Toolkit in Tuebingen: https://toolkit.tuebingen.mpg.de/#/tools/hhpred

gnmcsbnfrmtcsclb commented 5 years ago

Thank you Martin, but I need something for a command-line implementation due to the number of inputs to analyze. Is there anyone other than Johannes himself who you think would be in a position to answer where the elusive a3m file and hhm file for PDB70 are? :) I am not asking any question about setup, just how to address "Could find neither hhm_db nor a3m_db!"