YoshitakaMo / localcolabfold

ColabFold on your local PC
MIT License

colabfold_batch tries to connect to internet even if msa provided as parameter #211

Open jflucier opened 9 months ago

jflucier commented 9 months ago

Hi,

I have successfully run colabfold_search locally using the following command:

colabfold_search \
--threads 32 --use-env 1 --use-templates 1 \
--mmseqs mmseqs \
--db1 /home/jflucier/projects/def-marechal/programs/colabfold_db/uniref30_2302_db \
--db2 /home/jflucier/projects/def-marechal/programs/colabfold_db/pdb100_230517 \
--db3 /home/jflucier/projects/def-marechal/programs/colabfold_db/colabfold_envdb_202108_db \
/home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/DTX1_DTX2.fa /home/jflucier/projects/def-marechal/programs/colabfold_db /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas

When I run colabfold_batch using the MSA, the script still tries to connect to the internet:

colabfold_batch \
--use-gpu-relax --amber --num-relax 3 \
--num-models 3 --templates \
--num-recycle 30 --recycle-early-stop-tolerance 0.5 \
--model-type auto \
--data /home/jflucier/projects/def-marechal/colabfold_db \
/home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas/0.a3m \
/home/jflucier/projects/def-marechal/programs/localcolabfold_env/test

This returns the following error:

+ echo 'running colabfold on /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/DTX1_DTX2.fa'
running colabfold on /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/DTX1_DTX2.fa
+ colabfold_batch --use-gpu-relax --amber --num-relax 3 --num-models 3 --templates --num-recycle 30 --recycle-early-stop-tolerance 0.5 --model-type auto --data /home/jflucier/projects/def-marechal/colabfold_db /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas/0.a3m /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test
2024-02-06 11:27:10,448 Running colabfold 1.5.2 (3e99c44eec189ec27f6d120af851adb7ff6aa2a2)
Traceback (most recent call last):
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/scipy-stack/2020b/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn
    conn = connection.create_connection(
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/scipy-stack/2020b/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection
    raise err
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/scipy-stack/2020b/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection
    sock.connect(sa)
OSError: [Errno 101] Network is unreachable

I have tried passing either /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas/ or /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas/0.a3m, and both attempts return the same network connection error.

The msas folder contains only the file 0.a3m.

Can you please tell me what I am doing wrong? Based on the previous answers you gave me in issue #184, colabfold_batch should not connect to the network if an MSA is provided.

Thank you very much for your help

YoshitakaMo commented 9 months ago

Please upgrade to ColabFold 1.5.5 (the latest version) to use the new --pdb-hit-file and --local-pdb-path arguments. See also https://github.com/sokrypton/ColabFold/issues/563. If you set --templates without these two arguments, ColabFold will try to search for templates over the Internet. However, we are aware of reported issues when using local templates in some cases, and we are currently working on fixing them.
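
The advice above can be sketched as a small wrapper that refuses to start unless every local input exists, then passes --templates together with the two arguments named in the reply. This is a minimal sketch, not a verified recipe: the function name run_batch_with_local_templates is hypothetical, the hit file is assumed to be the m8 file that colabfold_search writes when templates are enabled, and the PDB directory is assumed to be a local copy of the structure files.

```shell
# Hypothetical wrapper: run colabfold_batch with templates taken from
# local files only. Arguments: <a3m file> <m8 hit file> <local PDB dir> <output dir>
run_batch_with_local_templates() {
    a3m=$1
    hits=$2       # assumed: template hit file (.m8) from colabfold_search
    pdb_dir=$3    # assumed: local directory holding the template structures
    out_dir=$4

    # Fail early instead of letting the job fall back to a network lookup.
    for f in "$a3m" "$hits"; do
        [ -f "$f" ] || { echo "missing file: $f" >&2; return 1; }
    done
    [ -d "$pdb_dir" ] || { echo "missing directory: $pdb_dir" >&2; return 1; }

    colabfold_batch --templates \
        --pdb-hit-file "$hits" \
        --local-pdb-path "$pdb_dir" \
        "$a3m" "$out_dir"
}
```

Any other options (for example --amber or --num-models) would be added to the colabfold_batch call as in the original command. All four paths are placeholders to be replaced with real locations.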

jflucier commented 9 months ago

Hi again,

Thank you very much for your awesome support.

I have rerun colabfold_search with the template option removed:

colabfold_search \
--threads 32 --use-env 1 --db-load-mode 0 \
--mmseqs mmseqs \
--db1 /home/jflucier/projects/def-marechal/programs/colabfold_db/uniref30_2302_db \
--db2 /home/jflucier/projects/def-marechal/programs/colabfold_db/pdb100_230517 \
--db3 /home/jflucier/projects/def-marechal/programs/colabfold_db/colabfold_envdb_202108_db \
/home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/DTX1_DTX2.fa /home/jflucier/projects/def-marechal/programs/colabfold_db /home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas3

This produces an a3m file (no m8 file). Then I ran colabfold_batch (again with no template option):

colabfold_batch \
--use-gpu-relax --amber --num-relax 3 \
--num-models 3 \
--num-recycle 30 --recycle-early-stop-tolerance 0.5 \
--model-type auto \
--data /home/jflucier/projects/def-marechal/colabfold_db \
/home/jflucier/projects/def-marechal/programs/localcolabfold_env/test/msas3/0.a3m \
/home/jflucier/projects/def-marechal/programs/localcolabfold_env/test

I get the same network error:

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /alphafold/alphafold_params_colab_2022-12-06.tar (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x1554683ba340>: Failed to establish a new connection: [Errno 101] Network is unreachable'))

Thanks again for your help.
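
The URL in this second error points at alphafold_params_colab_2022-12-06.tar, so the failing request looks like a download of model weights rather than an MSA or template search. Note also that --data in the commands above is .../def-marechal/colabfold_db, while the databases passed to colabfold_search live under .../def-marechal/programs/colabfold_db; the thread does not say whether that difference is intended. A minimal pre-flight sketch follows, assuming the weights are expected in a params subdirectory of the --data directory (an assumption about the layout, not something stated in this thread). The function name check_params_dir is hypothetical.

```shell
# Hypothetical pre-flight check to run before submitting a job to a
# compute node without network access.
check_params_dir() {
    data_dir=$1
    # Assumed layout: weights are stored in <data_dir>/params.
    params_dir="$data_dir/params"

    # Succeed only if the directory exists and is not empty.
    if [ -d "$params_dir" ] && [ -n "$(ls -A "$params_dir" 2>/dev/null)" ]; then
        echo "found weights directory: $params_dir"
        return 0
    fi

    echo "no weights under $params_dir; a download would likely be attempted" >&2
    return 1
}
```

For example, `check_params_dir /path/to/data || exit 1` at the top of a job script stops the job before colabfold_batch starts. If the check fails, the weights would need to be placed there from a machine that does have network access.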

ccoulombe commented 9 months ago

Note that this is executed on an HPC system, where the compute nodes do not have access to the outside world. It would be very beneficial to fix all of these issues so that ColabFold can run on HPC systems without access to an outside network (internet).

allcatsaregrey commented 7 months ago

I also have this issue on an HPC system with localcolabfold. All the MSAs are precomputed and I am not using templates.

jflucier commented 7 months ago

Hi @allcatsaregrey

I managed to get the environment working by rebuilding it from scratch.

Attached is my environment file, venv.colabfold.af2.3.2.requirements.txt

allcatsaregrey commented 7 months ago

A fresh install does not appear to work for me, sadly.