Open zzy221127 opened 1 year ago
Could you please check your CUDA environment? You should have the nvcc compiler available:
nvcc -V
If you do not have the CUDA compiler, your conda environment may contain only the CUDA runtime. In that case you can either install the full CUDA toolkit from the NVIDIA website or install the CUDA development packages in conda.
Thank you. Below is what 'nvcc -V' shows; it seems the CUDA compiler is already installed:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0
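For anyone who wants to script this check, here is a minimal sketch that looks for nvcc on PATH and pulls the release number out of its version output. The function names are made up for illustration, and the regular expression only assumes the "release X.Y" wording shown in the output above.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Return the CUDA release string (e.g. '11.7') from `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", text)
    return match.group(1) if match else None


def installed_nvcc_release():
    """Run `nvcc -V` if nvcc is on PATH and return its release string, else None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        # No compiler on PATH: possibly only the CUDA runtime is installed.
        return None
    result = subprocess.run([nvcc, "-V"], capture_output=True, text=True, check=False)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print(installed_nvcc_release())
```

A result of None means nvcc was not found or its output did not match the assumed pattern; it does not by itself prove that the toolkit is missing.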
OK, could you please provide your CUDA path (the output of 'which nvcc') and the way you installed triton?
The simplest fix is to uninstall triton; the code will then fall back to the CUDA kernel.
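The fallback described above can be pictured with a small sketch. This is not FastFold's actual code: pick_backend is a hypothetical helper that only illustrates the general pattern of preferring an optional module when it is importable and falling back otherwise.

```python
import importlib.util


def pick_backend(optional_module="triton", fallback="cuda"):
    """Return the optional module's name if it can be imported, else the fallback name."""
    if importlib.util.find_spec(optional_module) is not None:
        return optional_module
    return fallback


if __name__ == "__main__":
    # Prints 'triton' when triton is installed in the active environment, else 'cuda'.
    print(pick_backend())
```

Under this pattern, uninstalling the optional package is enough to switch paths, with no code change needed.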
Thank you so much! After your kind reminder, it turned out to be an installation problem with triton.
I first installed triton with the command:
pip install triton==2.0.0.dev20221005
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting triton==2.0.0.dev20221005
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/11/f3/db2d366485b3160419f8415e0293aac6daaa018d7a02b9c0a40f89a137bf/triton-2.0.0.dev20221005-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.7 MB)
Requirement already satisfied: torch in /home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages (from triton==2.0.0.dev20221005) (1.13.0+cu117)
Requirement already satisfied: filelock in /home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages (from triton==2.0.0.dev20221005) (3.8.0)
Requirement already satisfied: cmake in /home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages (from triton==2.0.0.dev20221005) (3.24.3)
Requirement already satisfied: typing-extensions in /home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages (from torch->triton==2.0.0.dev20221005) (4.4.0)
Installing collected packages: triton
Successfully installed triton-2.0.0.dev20221005
It seemed OK.
Then I used the following command to install triton again:
git clone https://github.com/openai/triton.git ~/triton \
&& cd ~/triton/python \
&& pip install -e . -i https://pypi.tuna.tsinghua.edu.cn/simple --default-timeout=10000000
and got the error message below. Do you have any suggestions for this?
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Obtaining file:///home/triton/python
Preparing metadata (setup.py) ... done
Requirement already satisfied: cmake in /home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages (from triton==2.0.0) (3.24.3)
Requirement already satisfied: filelock in /home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages (from triton==2.0.0) (3.8.0)
Requirement already satisfied: torch in /home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages (from triton==2.0.0) (1.13.0+cu117)
Requirement already satisfied: typing-extensions in /home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages (from torch->triton==2.0.0) (4.4.0)
Installing collected packages: triton
Attempting uninstall: triton
Found existing installation: triton 2.0.0
Uninstalling triton-2.0.0:
Successfully uninstalled triton-2.0.0
Running setup.py develop for triton
error: subprocess-exited-with-error
× python setup.py develop did not run successfully.
│ exit code: 1
╰─> [59 lines of output]
running develop
running egg_info
writing triton.egg-info/PKG-INFO
writing dependency_links to triton.egg-info/dependency_links.txt
writing requirements to triton.egg-info/requires.txt
writing top-level names to triton.egg-info/top_level.txt
reading manifest file 'triton.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'triton.egg-info/SOURCES.txt'
running build_ext
/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/home/triton/python/setup.py", line 152, in <module>
setup(
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/setuptools/__init__.py", line 87, in setup
return distutils.core.setup(**attrs)
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 968, in run_commands
self.run_command(cmd)
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/setuptools/dist.py", line 1217, in run_command
super().run_command(command)
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
cmd_obj.run()
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/setuptools/command/develop.py", line 34, in run
self.install_for_development()
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/setuptools/command/develop.py", line 114, in install_for_development
self.run_command('build_ext')
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
self.distribution.run_command(command)
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/setuptools/dist.py", line 1217, in run_command
super().run_command(command)
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
cmd_obj.run()
File "/home/triton/python/setup.py", line 114, in run
self.build_extension(ext)
File "/home/triton/python/setup.py", line 118, in build_extension
thirdparty_cmake_args = get_thirdparty_packages(triton_cache_path)
File "/home/triton/python/setup.py", line 74, in get_thirdparty_packages
file.extractall(path=package_root_dir)
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/tarfile.py", line 2028, in extractall
self.extract(tarinfo, path, set_attrs=not tarinfo.isdir(),
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/tarfile.py", line 2069, in extract
self._extract_member(tarinfo, os.path.join(path, tarinfo.name),
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/tarfile.py", line 2141, in _extract_member
self.makefile(tarinfo, targetpath)
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/tarfile.py", line 2190, in makefile
copyfileobj(source, target, tarinfo.size, ReadError, bufsize)
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/tarfile.py", line 249, in copyfileobj
raise exception("unexpected end of data")
tarfile.ReadError: unexpected end of data
downloading and extracting https://github.com/llvm/llvm-project/releases/download/llvmorg-15.0.4/clang+llvm-15.0.4-powerpc64le-linux-ubuntu-18.04.5.tar.xz ...
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
Rolling back uninstall of triton
Moving to /home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/triton.egg-link
from /tmp/pip-uninstall-q6f21a3r/triton.egg-link
The log suggests a network problem: the LLVM archive could not be downloaded completely from GitHub, so extraction failed with "unexpected end of data". You should use pip install triton==2.0.0.dev20221005 to install that specific version of triton. The main branch of triton is not stable. If you keep struggling with triton, just uninstall it and run again.
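To confirm that a downloaded archive is truncated before retrying a build, a rough sketch like the following reads every member of a tar file and reports whether that succeeds. archive_is_complete is a hypothetical helper, not part of triton's setup.py, and the path you pass in is whatever file you downloaded yourself.

```python
import lzma
import tarfile


def archive_is_complete(path):
    """Return True if every member of the tar archive at `path` can be read in full."""
    try:
        with tarfile.open(path, mode="r:*") as tar:
            for member in tar:
                if member.isfile():
                    extracted = tar.extractfile(member)
                    # Read in chunks so large archives need not fit in memory.
                    while extracted.read(1 << 20):
                        pass
    except (tarfile.TarError, lzma.LZMAError, EOFError, OSError):
        # A truncated download typically surfaces here, e.g. as
        # tarfile.ReadError: unexpected end of data.
        return False
    return True
```

Nothing is extracted to disk; the data is only read and discarded. Note that an archive missing only its trailing end-of-archive padding may still pass, since tarfile tolerates that.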
Dear Shenggan:
After uninstalling triton, I successfully ran the inference.py script with no errors printed.
The output is one relaxed.pdb, one unrelaxed.pdb, and one "alignments" folder, right?
It definitely feels much faster than running AlphaFold2,
but I am wondering: without triton, am I still "leveraging the power of FastFold"?
The expected output file is correct.
You can already get great acceleration from the CUDA kernel when triton is not installed. The Triton kernel is currently experimental. It can give some additional speedup on the NVIDIA Ampere platform (maybe 10%~20%).
I think you can try triton==2.0.0.dev20221005 and figure out why it cannot find cuda.h. You could try setting the environment variable CUDA_HOME to your CUDA path.
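Before setting CUDA_HOME, it can help to see which candidate directory actually contains include/cuda.h. The sketch below is only an illustration: the helper names are invented, and the candidate list (CUDA_HOME, CUDA_PATH, then /usr/local/cuda) is an assumption about common layouts, not something triton is known to consult in that order.

```python
import os


def find_cuda_root(candidates):
    """Return the first directory in `candidates` containing include/cuda.h, else None."""
    for root in candidates:
        if root and os.path.isfile(os.path.join(root, "include", "cuda.h")):
            return root
    return None


def default_candidates():
    """Assumed candidate roots: CUDA_HOME, CUDA_PATH, then /usr/local/cuda."""
    return [os.environ.get("CUDA_HOME"), os.environ.get("CUDA_PATH"), "/usr/local/cuda"]


if __name__ == "__main__":
    print(find_cuda_root(default_candidates()))
```

If this prints None, the header is probably somewhere else (for example inside the conda environment), and that directory is the one to point CUDA_HOME at.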
Dear author:
I am trying to test FastFold. I followed the "Installation Using Conda" instructions (I think there is no command to test for a successful installation).
I ran inference.py with the following command:
#################################
conda activate fastfold
python /home/FastFold/inference.py used.fasta /database/alphafold2-data/pdb_mmcif/mmcif_files/ \
    --output_dir /mydir/output \
    --cpus 80 \
    --gpus 3 \
    --param_path /database/alphafold2-data/params/params_model_1.npz \
    --uniref90_database_path /database/alphafold2-data/uniref90/uniref90.fasta \
    --mgnify_database_path /database/alphafold2-data/mgnify/mgy_clusters_2018_12.fa \
    --pdb70_database_path /database/alphafold2-data/pdb70/pdb70 \
    --uniclust30_database_path /database/alphafold2-data/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
    --bfd_database_path /database/alphafold2-data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
    --jackhmmer_binary_path /home/Software/miniconda3/envs/fastfold/bin/jackhmmer \
    --hhblits_binary_path /home/Software/miniconda3/envs/fastfold/bin/hhblits \
    --hhsearch_binary_path /home/Software/miniconda3/envs/fastfold/bin/hhsearch \
    --kalign_binary_path /home/Software/miniconda3/envs/fastfold/bin/kalign
#################################
It seemed fine through the jackhmmer→hhsearch→jackhmmer→hhblits steps,
but then I got the error shown below.
I am wondering what it means and what I should do to run FastFold properly.
##########error message##################
/tmp/tmp4wm30exa/main.c:2:10: fatal error: cuda.h: No such file or directory
2 | #include "cuda.h"
  | ^~~~
compilation terminated.
/tmp/tmp65558a3s/main.c:2:10: fatal error: cuda.h: No such file or directory
2 | #include "cuda.h"
  | ^~~~
compilation terminated.
Traceback (most recent call last):
File "/home/FastFold/inference.py", line 513, in <module>
main(args)
File "/home/FastFold/inference.py", line 150, in main
inference_monomer_model(args)
File "/home/FastFold/inference.py", line 415, in inference_monomer_model
torch.multiprocessing.spawn(inference_model, nprocs=args.gpus, args=(args.gpus, result_q, batch, args))
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
while not context.join():
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 160, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "<string>", line 21, in _layer_norm_fwd_fused
KeyError: ('2-.-0-.-0-d82511111ad128294e9d31a6ac684238-7929002797455b30efce6e41eddc6b57-3aa563e00c5c695dd945e23b09a86848-bb0203f280ee2aaa28bc6e4eff4090f3-ff946bd4b3b4a4cbdf8cedc6e1c658e0-5c5e32ff210f3b7f56c98ca29917c25e-06f0df2d61979d629033f4a22eff5198-0dd03b0bd512a184b3512b278d9dfa59-d35ab04ae841e2714a253c523530b071', (torch.float32, torch.float32, torch.float32, torch.float32, torch.float32, torch.float32, 'i32', 'i32', 'fp32'), (256,), (True, True, True, True, True, True, (True, False), (True, False), (False,)))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
fn(i, *args)
File "/home/FastFold/inference.py", line 135, in inference_model
out = model(batch)
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/FastFold/fastfold/model/hub/alphafold.py", line 507, in forward
outputs, m_1_prev, z_prev, x_prev = self.iteration(
File "/home/FastFold/fastfold/model/hub/alphafold.py", line 232, in iteration
m_1_prev, z_prev = self.recycling_embedder(
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/FastFold/fastfold/model/fastnn/ops.py", line 1097, in forward
m_update = self.layer_norm_m(m)
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/home/FastFold/fastfold/model/fastnn/kernel/layer_norm.py", line 52, in forward
return self.kernel_forward(input)
File "/home/FastFold/fastfold/model/fastnn/kernel/layer_norm.py", line 56, in kernel_forward
return LayerNormTritonFunc.apply(input, self.normalized_shape, self.weight, self.bias,
File "/home/FastFold/fastfold/model/fastnn/kernel/triton/layer_norm.py", line 164, in forward
_layer_norm_fwd_fused[(M,)](
File "/home/triton/python/triton/runtime/jit.py", line 106, in launcher
return self.run(*args, grid=grid, **kwargs)
File "<string>", line 41, in _layer_norm_fwd_fused
File "/home/triton/python/triton/compiler.py", line 1239, in compile
so = _build(fn.name, src_path, tmpdir)
File "/home/triton/python/triton/compiler.py", line 1169, in _build
ret = subprocess.check_call(cc_cmd)
File "/home/Software/miniconda3/envs/fastfold/lib/python3.8/subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmp65558a3s/main.c', '-O3', '-I/usr/local/cuda/include', '-I/home/Software/miniconda3/envs/fastfold/include/python3.8', '-I/tmp/tmp65558a3s', '-shared', '-fPIC', '-lcuda', '-o', '/tmp/tmp65558a3s/_layer_norm_fwd_fused.cpython-38-x86_64-linux-gnu.so', '-L/usr/lib/x86_64-linux-gnu']' returned non-zero exit status 1.
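The failed gcc command above lists exactly which include directories were searched, which is a quick way to see where cuda.h was expected. As a rough sketch (the helper names are invented for illustration), the -I entries can be pulled out of such an argument list and checked for the header:

```python
import os


def include_dirs(cc_cmd):
    """Return the directories passed as -I<dir> in a compiler argument list."""
    return [arg[2:] for arg in cc_cmd if arg.startswith("-I") and len(arg) > 2]


def dirs_missing_header(cc_cmd, header="cuda.h"):
    """Return the -I directories from `cc_cmd` that do not contain `header`."""
    return [d for d in include_dirs(cc_cmd) if not os.path.isfile(os.path.join(d, header))]


if __name__ == "__main__":
    # Argument list copied from the CalledProcessError message in this thread.
    cmd = ['/usr/bin/gcc', '/tmp/tmp65558a3s/main.c', '-O3',
           '-I/usr/local/cuda/include',
           '-I/home/Software/miniconda3/envs/fastfold/include/python3.8',
           '-I/tmp/tmp65558a3s', '-shared', '-fPIC', '-lcuda']
    print(dirs_missing_header(cmd))
```

In this thread the only CUDA-related entry is /usr/local/cuda/include, so the error indicates that cuda.h is not present there on this machine.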