mavericb closed 3 months ago
There are instructions on how to set it up correctly here: https://github.com/sokrypton/ColabFold/tree/main/MsaServer
Sorry to bother you again. I see a setup_databases.sh in the main folder, and another script, setup-and-start-local.sh, in the MsaServer folder. I already ran setup_databases.sh successfully, and then tried to run setup-and-start-local.sh, but got the error "PDB rsync server was not chosen, please edit this script to choose which PDB download server you want to use".
I think it would be very helpful to have step-by-step instructions in the README on how to use ColabFold with a local MsaServer, possibly with additional explanation for running the MsaServer when using ColabFold via Docker.
I'm very confused now and don't know how to proceed further :(
setup_databases.sh and MsaServer/setup-and-start-local.sh seem to do different things. So the plan is to use MsaServer/setup-and-start-local.sh and, hopefully, the server will then be up and usable from the local ColabFold Docker image.
I had to uncomment a line to select the PDB server:
PDB_SERVER=rsync.wwpdb.org::ftp # RCSB PDB server name
PDB_PORT=33444
but I'm not sure if this is the right thing to do.
And then I had to install Go and Aria via apt-get install.
Now it's downloading a 95 GB file. I'm not sure whether I already downloaded it during the setup_databases.sh run:
*** Download Progress Summary as of Thu Jul 18 20:59:13 2024 ***
===================================================================================================================
[#3ec324 9.0GiB/95GiB(9%) CN:5 DL:10MiB ETA:2h15m4s]
FILE: ./uniref30_2302.tar.gz
-------------------------------------------------------------------------------------------------------------------
[#3ec324 9.3GiB/95GiB(9%) CN:5 DL:11MiB ETA:2h13m38s]
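One way to check whether an earlier run already fetched this database is to look for files with the same name stem in the database directory before letting the download continue. This is only a rough sketch: the function name find_existing is made up for illustration, the directory in the usage comment is a placeholder, and I am assuming (not confirming) that files left by a previous setup share the uniref30_2302 stem shown in the progress output.

```python
from pathlib import Path


def find_existing(db_dir, stem):
    """Return the sorted names of entries in db_dir whose name starts with stem.

    An empty list means nothing with that stem is present (or db_dir is missing).
    """
    root = Path(db_dir).expanduser()
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir() if entry.name.startswith(stem))


# Usage (placeholder directory, replace with the one you passed to the setup script):
#   for name in find_existing("/path/to/databases", "uniref30_2302"):
#       print(name)
```

If this prints the archive or extracted database files, the download is probably a repeat of work already done; if it prints nothing, the two scripts are most likely writing to different directories.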
I cloned a new repository and followed the instructions here: https://github.com/sokrypton/ColabFold/tree/main/MsaServer.
However, I encountered two problems:
1. I get this error:
2024/07/19 10:01:05 open ~/databases/pdb70_a3m.ffdata: no such file or directory
2. The instructions claim that "The script can be called repeatedly to start the server. It will avoid doing any unnecessary setup work." However, when I call the script again, I get the error:
~/amelie/Workspace/ColabFold/MsaServer/mmseqs-server ~/amelie/Workspace/ColabFold/MsaServer
You are not currently on a branch.
Please specify which branch you want to merge with.
See git-pull(1) for details.
git pull <remote> <branch>
:(
Hmmm, maybe it's the config.json that is outdated. I see pdb70 there, but in the downloaded files I have pdb100, and the UniRef name doesn't match either. So I am trying to update the config.json to match the downloaded files.
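A quick way to spot this kind of mismatch is to walk the config and test every path-like string against the disk. The sketch below makes no assumption about the config.json layout: it visits every string value, skips anything that is not path-like, and treats a value as present if any file in its directory starts with that name (MMseqs2-style databases are stored as several files sharing a prefix). The function names are made up for illustration. It also flags values beginning with ~, because tilde expansion is done by the shell and a program reading the path from a JSON file may not expand it.

```python
import json
import os
from pathlib import Path


def iter_strings(node):
    """Yield every string value found anywhere in a parsed JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_strings(value)


def check_config_paths(config_file):
    """Return (value, problem) pairs for path-like strings in a JSON config."""
    with open(config_file, encoding="utf-8") as fh:
        config = json.load(fh)
    problems = []
    for value in iter_strings(config):
        if "/" not in value or "://" in value:
            continue  # not path-like, or a URL
        if value.startswith("~"):
            problems.append((value, "starts with ~, which only a shell expands"))
        path = Path(os.path.expanduser(value))
        parent = path.parent
        # Accept the exact file or any sibling whose name starts with it,
        # since a database is usually a prefix shared by several files.
        found = parent.is_dir() and any(
            entry.name.startswith(path.name) for entry in parent.iterdir()
        )
        if not found:
            problems.append((value, "no file or prefix match on disk"))
    return problems


# Usage (placeholder path):
#   for value, problem in check_config_paths("config.json"):
#       print(f"{value}: {problem}")
```

Any entry reported as missing (for example one that still names pdb70 when only pdb100 files were downloaded) is a candidate for editing in config.json.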
I used the fork from this pull request and now it's working: https://github.com/sokrypton/ColabFold/pull/534. But new errors have appeared:
File "/usr/local/envs/colabfold/lib/python3.9/site-packages/colabfold/colabfold.py", line 209, in run_mmseqs2
raise Exception(f'MMseqs2 API is giving errors. Please confirm your input is a valid protein sequence. If error persists, please try again an hour later.')
Exception: MMseqs2 API is giving errors. Please confirm your input is a valid protein sequence. If error persists, please try again an hour later.
2024-07-19 19:30:23,169 Query 10/10: run-1_102__id_10__T_0.05__seed_111__overall_confidence_0.2730__ligand_confidence_0.2730__seq_rec_0.0412 (length 257)
2024-07-19 19:30:23,170 Server didn't reply with json: 404 page not found
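The "404 page not found" line suggests the client reached a web server, but at a URL that server does not serve, so the reply was plain text rather than JSON. To narrow that down, it can help to request the exact URL by hand and see what comes back. The sketch below uses only the Python standard library; probe and classify_reply are names made up for illustration, and the URL in the usage comment is a placeholder, since the right host, port and endpoint depend on how your server and client are configured.

```python
import json
import urllib.error
import urllib.request


def classify_reply(status, body):
    """Summarise an HTTP reply the way a JSON-expecting client would see it."""
    try:
        json.loads(body)
    except ValueError:
        return f"HTTP {status}, not JSON: {body.strip()[:80]}"
    return f"HTTP {status}, valid JSON"


def probe(url, timeout=10):
    """GET url and classify the reply; connection failures are reported as text."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as reply:
            return classify_reply(reply.status, reply.read().decode("utf-8", "replace"))
    except urllib.error.HTTPError as err:
        # 4xx/5xx replies still carry a body worth inspecting.
        return classify_reply(err.code, err.read().decode("utf-8", "replace"))
    except (urllib.error.URLError, OSError) as err:
        return f"no reply: {err}"


# Usage (placeholder URL, substitute the address your client is pointed at):
#   print(probe("http://HOST:PORT/PATH"))
```

A "no reply" result points at networking (for example the container cannot reach the host), while a 404 with a non-JSON body points at a server that is running but does not expose the path the client is asking for.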
Hi @mavericb
Thanks a lot for the detailed blog post: https://www.blopig.com/blog/2024/04/dockerized-colabfold-for-large-scale-batch-predictions/
A newbie question here: I happened to see the following from @YoshitakaMo for localcolabfold, where --use-env 1 --use-templates 1 --db2 pdb100_230517 is used with colabfold_search, but the same arguments are not used in your search with colabfold_search.
MMSEQS_PATH="/path/to/your/mmseqs2/for_colabfold"
DATABASE_PATH="/mnt/databases"
INPUTFILE="ras_raf.fasta"
OUTPUTDIR="ras_raf"
colabfold_search \
--use-env 1 \
--use-templates 1 \
--db-load-mode 2 \
--db2 pdb100_230517 \
--mmseqs ${MMSEQS_PATH}/bin/mmseqs \
--threads 4 \
${INPUTFILE} \
${DATABASE_PATH} \
${OUTPUTDIR}
I'd appreciate your input and help here.
Thanks in advance.
I'm trying to use a local MMseqs2 server with ColabFold running in a Docker container. However, I'm encountering several issues:
It's not clear if ColabFold is using the default server or a local one. How can I verify this and ensure it's using a local server?
I tried setting up a local server using https://github.com/soedinglab/MMseqs2-App, but I'm getting MMseqs2 API errors (see attached image).
When attempting to set up the database using the setup_databases.sh script, I get the error:
Any guidance on configuring the Docker environment to work with a local MMseqs2 server would be greatly appreciated.
Thanks!!