rafaeltiveron opened 2 months ago
I've reduced the errors from 8 to 2 by modifying line 79 of tests/compare_utils.py to:

```python
_param_path = os.path.join(dir_path, "/<directory until>/openfold/resources/params", f"params_{consts.model}.npz")
```
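As an aside: `os.path.join` discards every component that precedes an absolute one, so `dir_path` in the line above is effectively ignored once the second argument starts with `/`. A minimal illustration with made-up paths:

```python
import os

# When a later component is absolute, os.path.join drops everything
# before it, so the first argument has no effect here:
print(os.path.join("/usr/local/openfold/tests", "/opt/params", "params_model_1.npz"))
# -> /opt/params/params_model_1.npz
```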
The error count dropped from 2 to 1 when line 405 of tests/test_import_weights.py was modified to:

```python
npz_path = Path(__file__).parent.resolve() / f"openfold/resources/params/params_{consts.model}.npz"
```

considering that I created the symlink with `ln -s /<directory until>/openfold /<directory until>/openfold/tests/openfold`.
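If the repository has the usual layout (tests/ sitting next to the openfold package at the repo root, which is an assumption on my part), the symlink could probably be avoided by walking one directory up instead; a sketch:

```python
from pathlib import Path

# Hypothetical alternative that avoids the tests/openfold symlink,
# assuming tests/ and the openfold package share the repository root
# (consts is the same object the test already imports):
repo_root = Path(__file__).resolve().parent.parent
npz_path = repo_root / "openfold" / "resources" / "params" / f"params_{consts.model}.npz"
```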
The following error persists:
```
ERROR: test_lma_vs_attention (tests.test_primitives.TestLMA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/openfold/tests/test_primitives.py", line 45, in test_lma_vs_attention
    l = a(q, kv, biases=biases, use_lma=True).cpu()
  File "/usr/local/miniforge/mambaforge/envs/openfold_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/miniforge/mambaforge/envs/openfold_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/openfold/openfold/model/primitives.py", line 537, in forward
    o = _lma(q, k, v, biases, lma_q_chunk_size, lma_kv_chunk_size)
  File "/usr/local/openfold/openfold/model/primitives.py", line 733, in _lma
    a = torch.einsum(
  File "/usr/local/miniforge/mambaforge/envs/openfold_env/lib/python3.10/site-packages/torch/functional.py", line 377, in einsum
    return _VF.einsum(equation, operands) # type: ignore[attr-defined]
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.62 GiB. GPU 0 has a total capacty of 7.66 GiB of which 1.08 GiB is free. Including non-PyTorch memory, this process has 5.88 GiB memory in use. Of the allocated memory 4.11 GiB is allocated by PyTorch, and 1.65 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
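The failing einsum in `_lma` materializes a chunk of attention scores of roughly shape `[..., q_chunk, kv_chunk]`, so the allocation should shrink more or less in proportion to the chunk sizes. As a sketch (assuming `Attention.forward` exposes the `lma_q_chunk_size` / `lma_kv_chunk_size` arguments it passes to `_lma` in the traceback above; I haven't checked the exact signature), the test call could be tried with smaller chunks:

```python
# Hypothetical variant of the failing call at tests/test_primitives.py:45.
# Smaller chunks trade speed for memory; the values below are guesses,
# and they may need to divide the q/kv sequence lengths evenly.
l = a(
    q, kv, biases=biases,
    use_lma=True,
    lma_q_chunk_size=256,
    lma_kv_chunk_size=1024,
).cpu()
```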
The exact memory numbers vary a little between runs, but the failure is the same.
I've added the following at the start of openfold/tests/test_primitives.py and openfold/openfold/model/primitives.py:

```python
import os
# Configuration to avoid GPU memory fragmentation:
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:1024'
```
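One caveat: as far as I know, `PYTORCH_CUDA_ALLOC_CONF` is only read when the CUDA caching allocator initializes on the first CUDA allocation, so the setting has no effect if some other module has already touched the GPU by the time these files are imported. Setting it before anything imports torch (or exporting it in the shell before launching the tests) avoids that ordering problem; a minimal sketch:

```python
import os

# Set the allocator config before torch performs any CUDA allocation;
# once the caching allocator is initialized, changing this is too late.
# The value 128 is just a guess; smaller splits fight fragmentation harder.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # noqa: E402  (deliberately imported after setting the env var)

print(torch.cuda.get_device_name(0))  # sanity check that the GPU is visible
```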
That reduced the GPU memory in use when the prediction starts from about 5 GB to about 3 GB, but it is still not enough for this test to finish successfully: it still runs out of memory. I don't know how to apply `"use_lma": true` for tests.test_primitives.TestLMA; maybe it could reduce memory consumption further. I'm using an NVIDIA GeForce RTX 4060.
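In case it helps narrow things down, torch's standard memory counters can show whether the 1.62 GiB request fails because of earlier accumulation or because of the einsum itself; a small diagnostic sketch using only documented torch.cuda calls:

```python
import torch

def report(tag: str) -> None:
    # memory_allocated / max_memory_allocated are standard torch.cuda APIs;
    # both return bytes, converted to GiB here for readability.
    alloc = torch.cuda.memory_allocated() / 2**30
    peak = torch.cuda.max_memory_allocated() / 2**30
    print(f"[{tag}] allocated={alloc:.2f} GiB, peak={peak:.2f} GiB")

report("before attention")
# ... run the attention call under test here ...
report("after attention")
```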
I've just run scripts/install_third_party_dependencies.sh and `python3 setup.py install`. No errors during the installation. The resources folder was created inside the openfold folder, but: