Closed drewaight closed 2 years ago
Please follow the README [here](https://github.com/sokrypton/ColabFold#generating-msas-for-large-scale-structurecomplex-predictions) and copy the output folder from the server you used for the search to one with the GPU you want to use for predictions. There is no need to set --host-url: if you run colabfold with a3m files as input, it will use them and will not contact a server.
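For anyone landing here later, the two stages described above can be sketched roughly as follows. This only builds the two command lines; the file and folder names (`input.fasta`, `database`, `msas`, `predictions`) are placeholders, not paths from this thread, and any extra flags you need are left out.

```python
import shlex


def search_command(fasta, db_dir, msa_dir):
    """Stage 1 (no GPU needed): write a3m files for the FASTA input into msa_dir."""
    return ["colabfold_search", str(fasta), str(db_dir), str(msa_dir)]


def predict_command(msa_dir, out_dir):
    """Stage 2 (GPU machine): predict from the copied a3m folder, no --host-url."""
    return ["colabfold_batch", str(msa_dir), str(out_dir)]


def as_shell(cmd):
    """Render an argument list as a copy-pasteable shell line."""
    return " ".join(shlex.quote(part) for part in cmd)


# Placeholder paths; replace with your own before running the printed lines.
print(as_shell(search_command("input.fasta", "database", "msas")))
print(as_shell(predict_command("msas", "predictions")))
```

Run the first line on the search machine, copy the `msas` folder across, then run the second line on the GPU machine.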
I get it now. Thanks!
Is it possible to create complexes by running mmseqs locally, either with the colabfold_search.sh shell script or with the search.py script? When I run colabfold_search.sh with the input fasta (chainA:chainB), just like in the colabfold notebook, the output is not a properly paired complex, whereas the notebook output is a perfect heterodimer. Thanks and sorry for my ignorance.
Drew
I believe I am having the same issue. It seems that colabfold_search, which produces the a3m files for colabfold, doesn't support complexes; I just found this in the command's help description.
Complexes should be supported. I removed the message. I just ran an example locally. Could you please post the full error message?
OK, I started over completely to make sure everything is the newest version. Here's what I did:
module load cuda/11.2.2
module load gcc/7.5.0
module load cmake/3.18.3
conda create -n colabfold python=3.7
conda activate colabfold
pip install "colabfold[alphafold] @ git+https://github.com/sokrypton/ColabFold"
pip install --upgrade "jax[cuda]<0.3.0" -f https://storage.googleapis.com/jax-releases/jax_releases.html
cd colabfold
git clone https://github.com/soedinglab/MMseqs2.git
cd MMseqs2
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_INSTALL_PREFIX=. ..
make
make install
export PATH=$(pwd)/bin/:$PATH
cd database
./setup_database.sh
colabfold_search trast_colabinp.fasta database msas
Where the search input is
>trastuzumab
DIQMTQSPSSLSASVGDRVTITCRASQDVNTAVAWYQQKPGKAPKLLIYSASFLYSGVPSRFSGSRSGTDFTLTISSLQPEDFATYYCQQHYTTPPTFGQGTKVEIK:EVQLVESGGGLVQPGGSLRLSCAASGFNIKDTYIHWVRQAPGKGLEWVARIYPTNGYTRYADSVKGRFTISADTSKNTAYLQMNSLRAEDTAVYYCSRWGGDGFYAMDYWGQGTLVTVSS
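As a quick sanity check on the input format, here is a minimal sketch that reads a FASTA where the chains of one complex share a record and are separated by `:` (the chainA:chainB convention used above). The helper name `parse_complex_fasta` and the short example sequences are made up for illustration; this is not ColabFold's own parser.

```python
def parse_complex_fasta(text):
    """Return a list of (name, [chain sequences]) from FASTA text.

    Chains of one complex are expected in a single record, separated by ':'.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks).split(":")))
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks).split(":")))
    return records


# Hypothetical two-chain record, with the sequence wrapped over two lines.
example = ">demo_complex\nAAAA:CC\nCC\n"
for record_name, chains in parse_complex_fasta(example):
    print(record_name, len(chains), [len(c) for c in chains])
```

If a record that should be a heterodimer reports one chain here, the `:` separator is missing from the input.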
Here is the error:
Traceback (most recent call last):
File "/SFS/user/ry/waight/anaconda3/envs/colabfold/bin/colabfold_search", line 8, in <module>
sys.exit(main())
File "/SFS/user/ry/waight/anaconda3/envs/colabfold/lib/python3.7/site-packages/colabfold/mmseqs/search.py", line 472, in main
with args.base.joinpath(f"{id}.paired.a3m").open("r") as f:
File "/SFS/user/ry/waight/anaconda3/envs/colabfold/lib/python3.7/pathlib.py", line 1208, in open
opener=self._opener)
File "/SFS/user/ry/waight/anaconda3/envs/colabfold/lib/python3.7/pathlib.py", line 1063, in _opener
return self._accessor.open(self, flags, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'msas/1.paired.a3m'
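The traceback shows the script opening `<id>.paired.a3m` inside the output folder, so a quick way to see which paired alignments never got written is to list the missing ones. A minimal sketch, assuming the ids are the integers seen in the error messages (0, 1, ...); the helper name `missing_paired_msas` is hypothetical and not part of ColabFold.

```python
import tempfile
from pathlib import Path


def missing_paired_msas(msa_dir, query_ids):
    """Return the ids whose '<id>.paired.a3m' file is absent from msa_dir."""
    base = Path(msa_dir)
    return [qid for qid in query_ids if not base.joinpath(f"{qid}.paired.a3m").is_file()]


# Demo on a throwaway folder: only the file for id 0 exists.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "0.paired.a3m").write_text(">demo\nAAAA\n")
    print(missing_paired_msas(tmp, [0, 1]))  # prints [1]
```

Pointing it at the real `msas` folder after a failed search shows whether the pairing step wrote anything at all.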
Thanks for any help or insight you can provide!
Drew
Could you try running the search with --threads 1?
I'm running into a very similar issue when submitting heterodimer sequences to mmseqs search locally:
colabfold_search --use-env=1 --use-templates=0 --db-load-mode=0 /app/input/abfd3e9562e06a036f79da967c9cdf1a.fasta /data/input/colabfold/ msas
Could not delete msas/0.paired.a3m!
Traceback (most recent call last):
File "/opt/conda/bin/colabfold_search", line 8, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.7/site-packages/colabfold/mmseqs/search.py", line 454, in main
threads=args.threads,
File "/opt/conda/lib/python3.7/site-packages/colabfold/mmseqs/search.py", line 313, in mmseqs_search_pair
".paired.a3m",
File "/opt/conda/lib/python3.7/site-packages/colabfold/mmseqs/search.py", line 23, in run_mmseqs
subprocess.check_call([mmseqs] + params)
File "/opt/conda/lib/python3.7/subprocess.py", line 363, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '[PosixPath('mmseqs'), 'unpackdb', PosixPath('msas/pair.a3m'), PosixPath('msas'), '--unpack-name-mode', '0', '--unpack-suffix', '.paired.a3m']' returned non-zero exit status 1.
Homodimer search works completely fine since uniref30 taxonomy was added to input data.
@IvansJasjkoQB could you try --threads 1 please? I think pairaln has some issues with multi-threading.
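If you script the search, the workaround can be wrapped as a fallback: try the default thread count first and rerun with `--threads 1` only if the first attempt fails. This is a sketch, not something from ColabFold itself; the function names are made up, the `runner` argument exists only so the logic can be exercised without MMseqs2 installed, and whether a failed run leaves files behind that need clearing before the retry is not checked here.

```python
import subprocess


def build_search_cmd(fasta, db_dir, msa_dir, threads=None):
    """Build the colabfold_search argument list, optionally pinning --threads."""
    cmd = ["colabfold_search"]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    return cmd + [str(fasta), str(db_dir), str(msa_dir)]


def search_with_single_thread_fallback(fasta, db_dir, msa_dir, runner=subprocess.check_call):
    """Run the search; on a non-zero exit, retry once with --threads 1."""
    try:
        runner(build_search_cmd(fasta, db_dir, msa_dir))
        return "default"
    except subprocess.CalledProcessError:
        runner(build_search_cmd(fasta, db_dir, msa_dir, threads=1))
        return "single-thread"
```

The return value tells you which attempt succeeded, which is handy for deciding whether an MMseqs2 update is still needed.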
The commit https://github.com/soedinglab/MMseqs2/commit/407b315e7edcbc9eb73527b904172e603095494e of MMseqs2 should also allow multi-threading.
Can confirm that setting --threads 1 resolves the issue. Will submit another run with multi-threading from the latest commit. Thanks for looking into it :)
confirmed that --threads 1 runs without errors. :)
I will download and recompile the latest MMseqs2 and test.
colabfold_batch creates a model but then errors with:
_tkinter.TclError: couldn't connect to display ":100"
localcolabfold/colabfold_batch/bin/colabfold_batch completes with correctly written out png files. Does the "regular" colabfold_batch output try to write out the images to the display? (is there a switch to turn this off?) Or otherwise is there any reason not to use the colabfold_batch from localcolabfold? Thanks for your help and patience.
Drew
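One possible workaround for the display error, assuming the plots are drawn with matplotlib and the `TclError` comes from a Tk-based backend trying to open a window: force the non-interactive `Agg` backend through the `MPLBACKEND` environment variable when launching the command. This is a guess at the cause rather than a confirmed diagnosis, and `headless_env` is a made-up helper name.

```python
import os


def headless_env(base_env=None):
    """Return a copy of the environment with matplotlib forced to the Agg backend."""
    env = dict(os.environ if base_env is None else base_env)
    env["MPLBACKEND"] = "Agg"  # file-only rendering, no display connection needed
    return env


# Example use (not executed here; "msas" and "predictions" are placeholders):
# import subprocess
# subprocess.run(["colabfold_batch", "msas", "predictions"], env=headless_env(), check=True)
```

Setting `MPLBACKEND=Agg` in the shell before calling colabfold_batch would have the same effect.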
confirmed that the latest commit at MMseqs2 resolves the error
Thanks Martin!
Drew
This issue is similar to #142, which was closed without a solution being provided. Rather than use the webserver (https://a3m.mmseqs.com/), I would prefer to run MMseqs2 locally with the --host-url switch. MMseqs2 works fine on my HPC (installed either through conda or compiled), and I can install and set up the databases described here (https://colabfold.mmseqs.com/), but this is not a webserver per se with a URL. Is there a way to simply run the MSA using the MMseqs2 CLI locally?
Perhaps there is something I am not understanding. Thanks so much for your help!
Drew