DeepGraphLearning / torchdrug

A powerful and flexible machine learning platform for drug discovery
https://torchdrug.ai/
Apache License 2.0

torch_ext.so cannot be found error on Macbook macOS #229

Open GZ82 opened 11 months ago

GZ82 commented 11 months ago

When I run solver.train(num_epoch=10) in the following code:

optimizer = torch.optim.Adam(task.parameters(), lr=1e-3)
solver = core.Engine(task, train_set, valid_set, test_set, optimizer, gpus=None, batch_size=1024)
# solver.train(num_epoch=10)
solver.evaluate("valid")

(I am using the CPU instead of GPUs, since there is no CUDA support on the MacBook. I am also wondering whether torchdrug supports MPS.)

I get this error on the first run; it seems related to ninja:

  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'torch_ext': [1/2] c++ -MMD -MF torch_ext.o.d -DTORCH_EXTENSION_NAME=torch_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_clang\" -DPYBIND11_STDLIB=\"_libcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1002\" -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include/TH -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include/THC -isystem /Users/Guo/.conda/envs/torchdrug/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -Ofast -fopenmp -DAT_PARALLEL_OPENMP -c /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/utils/extension/torch_ext.cpp -o torch_ext.o
FAILED: torch_ext.o
c++ -MMD -MF torch_ext.o.d -DTORCH_EXTENSION_NAME=torch_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_clang\" -DPYBIND11_STDLIB=\"_libcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1002\" -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include/TH -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include/THC -isystem /Users/Guo/.conda/envs/torchdrug/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -Ofast -fopenmp -DAT_PARALLEL_OPENMP -c /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/utils/extension/torch_ext.cpp -o torch_ext.o
clang: error: unsupported option '-fopenmp'
ninja: build stopped: subcommand failed.
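The last compiler line in this log is clang rejecting the -fopenmp flag, so the failure may lie with the compiler rather than with ninja. The following is a minimal sketch for checking that independently of torchdrug. It assumes the build used whatever `c++` resolves to on PATH (as the log shows); the helper name `compiler_accepts_flag` is hypothetical and not part of torchdrug or PyTorch.

```python
import os
import shutil
import subprocess
import tempfile


def compiler_accepts_flag(compiler: str, flag: str) -> bool:
    """Return True if `compiler` compiles a trivial C++ file when given `flag`."""
    if shutil.which(compiler) is None:
        return False  # the compiler is not on PATH at all
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.cpp")
        obj = os.path.join(tmp, "probe.o")
        with open(src, "w") as f:
            f.write("int main() { return 0; }\n")
        # Compile only (-c), mirroring the shape of the command in the log.
        result = subprocess.run(
            [compiler, flag, "-c", src, "-o", obj],
            capture_output=True,
            text=True,
        )
        return result.returncode == 0


if __name__ == "__main__":
    print("c++ accepts -fopenmp:", compiler_accepts_flag("c++", "-fopenmp"))
```

If this prints False on the affected machine, the extension build would be expected to fail in the same way regardless of how ninja is installed.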

If I run solver.train(num_epoch=10) again, the error changes to:

  File "<frozen importlib._bootstrap>", line 571, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1176, in create_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
ImportError: dlopen(/Users/Guo/Library/Caches/torch_extensions/py310_cpu/torch_ext_0/torch_ext.so, 0x0002): tried: '/Users/Guo/Library/Caches/torch_extensions/py310_cpu/torch_ext_0/torch_ext.so' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/Users/Guo/Library/Caches/torch_extensions/py310_cpu/torch_ext_0/torch_ext.so' (no such file), '/Users/Guo/Library/Caches/torch_extensions/py310_cpu/torch_ext_0/torch_ext.so' (no such file)

This seems related to "torch_extensions", "cpu", and "torch_ext.so". I tried the solution from issue #8 (rm -rf /home/your_user_name/.cache/torch_extensions), but I cannot find a torch_extensions folder at that location.
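The dlopen error above points at a cache under Library/Caches in the home directory, not the ~/.cache location from issue #8, which may explain why the folder was not found. Here is a small sketch that lists the places worth checking. It assumes only the two paths that appear in this thread plus the TORCH_EXTENSIONS_DIR environment variable that PyTorch honors; the function names are hypothetical.

```python
import os
from pathlib import Path
from typing import List, Optional


def candidate_cache_dirs(home: Optional[Path] = None) -> List[Path]:
    """List locations where a torch_extensions build cache might live."""
    home = Path.home() if home is None else Path(home)
    candidates = []
    override = os.environ.get("TORCH_EXTENSIONS_DIR")
    if override:
        candidates.append(Path(override))  # explicit override, if set
    # Location suggested in issue #8 (Linux-style cache).
    candidates.append(home / ".cache" / "torch_extensions")
    # Location shown in the dlopen error on macOS.
    candidates.append(home / "Library" / "Caches" / "torch_extensions")
    return candidates


def existing_cache_dirs(home: Optional[Path] = None) -> List[Path]:
    """Return only the candidate directories that actually exist."""
    return [p for p in candidate_cache_dirs(home) if p.is_dir()]


if __name__ == "__main__":
    for path in existing_cache_dirs():
        print("found build cache:", path)
```

Deleting the stale torch_ext_0 build directory under whichever path is reported would be the macOS counterpart of the rm -rf from issue #8, though it only clears the half-finished build and does not address the compiler error itself.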

I also tried adding the path of ninja in code:

ninja_dir = "/Users/Guo/.conda/envs/torchdrug/bin/ninja"
if ninja_dir not in os.environ['PATH']:
    os.environ['PATH'] = f"{ninja_dir}:{os.environ['PATH']}"

But this does not solve the ninja issue, and I am not sure whether the two problems are related.
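One detail about the PATH attempt: PATH entries are directories, so the entry would need to be the bin folder rather than the path of the ninja executable itself. The sketch below shows that variant and uses `shutil.which` to confirm what resolves. The directory is the one quoted in this thread, and `prepend_dir_to_path` is a hypothetical helper name. Since the log already contains "ninja: build stopped", ninja appears to have been found and run, so this is probably not the root cause.

```python
import os
import shutil


def prepend_dir_to_path(directory: str) -> None:
    """Put `directory` at the front of PATH unless it is already an entry."""
    entries = [e for e in os.environ.get("PATH", "").split(os.pathsep) if e]
    if directory not in entries:
        os.environ["PATH"] = os.pathsep.join([directory] + entries)


if __name__ == "__main__":
    # The bin/ directory of the conda environment, not .../bin/ninja.
    prepend_dir_to_path("/Users/Guo/.conda/envs/torchdrug/bin")
    print("ninja resolves to:", shutil.which("ninja"))
```

Comparing against individual entries avoids the substring check in the original attempt, which can match a longer path that merely contains the same text.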

Other info:

python version: 3.10.8

torchdrug installation (in a conda environment):

pip install ninja
pip install torch==1.13.0
pip install git+https://github.com/rusty1s/pytorch_scatter.git
pip install git+https://github.com/rusty1s/pytorch_cluster.git
pip install torchdrug

GZ82 commented 10 months ago

I tried running the same code from the command line. The errors are different, but they still seem related to ninja and torch_ext.o.

I also tried adding the path of ninja and python from my conda environment:

export PATH=/Users/Guo/.conda/envs/torchdrug/bin/:$PATH

Here is the error:


  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1900, in _run_ninja_build
    subprocess.run(
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/Guo/Pcode/Projects/CDI/getrightmol/src/getrightmol/models/torchdrug_local.py", line 59, in <module>
    solver.train(num_epoch=10)
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/core/engine.py", line 161, in train
    loss, metric = model(batch)
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/tasks/property_prediction.py", line 102, in forward
    pred = self.predict(batch, all_loss, metric)
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/tasks/property_prediction.py", line 140, in predict
    output = self.model(graph, graph.node_feature.float(), all_loss=all_loss, metric=metric)
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/models/gin.py", line 76, in forward
    hidden = layer(graph, layer_input)
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/layers/conv.py", line 91, in forward
    update = self.message_and_aggregate(graph, input)
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/layers/conv.py", line 337, in message_and_aggregate
    adjacency = utils.sparse_coo_tensor(graph.edge_list.t()[:2], graph.edge_weight,
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/utils/torch.py", line 185, in sparse_coo_tensor
    return torch_ext.sparse_coo_tensor_unsafe(indices, values, size)
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/utils/torch.py", line 26, in __getattr__
    return getattr(self.module, key)
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/utils/decorator.py", line 102, in __get__
    result = self.func(obj)
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/utils/torch.py", line 30, in module
    return cpp_extension.load(self.name, self.sources, self.extra_cflags, self.extra_cuda_cflags,
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1508, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1623, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'torch_ext': [1/2] c++ -MMD -MF torch_ext.o.d -DTORCH_EXTENSION_NAME=torch_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_clang\" -DPYBIND11_STDLIB=\"_libcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1002\" -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include/TH -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include/THC -isystem /Users/Guo/.conda/envs/torchdrug/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -Ofast -fopenmp -DAT_PARALLEL_OPENMP -c /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/utils/extension/torch_ext.cpp -o torch_ext.o 
FAILED: torch_ext.o 
c++ -MMD -MF torch_ext.o.d -DTORCH_EXTENSION_NAME=torch_ext -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_clang\" -DPYBIND11_STDLIB=\"_libcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1002\" -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include/TH -isystem /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torch/include/THC -isystem /Users/Guo/.conda/envs/torchdrug/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -Ofast -fopenmp -DAT_PARALLEL_OPENMP -c /Users/Guo/.conda/envs/torchdrug/lib/python3.10/site-packages/torchdrug/utils/extension/torch_ext.cpp -o torch_ext.o 
clang: error: unsupported option '-fopenmp'
ninja: build stopped: subcommand failed.
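In a traceback this long, the decisive message is the single compiler diagnostic near the end. As a rough sketch, a filter like the one below can pull such lines out of captured build output. The sample string is shortened from the log above for illustration, and `extract_error_lines` is a hypothetical helper, not a PyTorch function.

```python
import re
from typing import List

# Matches compiler-style diagnostics such as "clang: error: ...".
# The word boundary keeps "RuntimeError:" from matching.
_DIAGNOSTIC = re.compile(r"\berror:")


def extract_error_lines(build_output: str) -> List[str]:
    """Return the lines of `build_output` that look like compiler errors."""
    return [
        line.strip()
        for line in build_output.splitlines()
        if _DIAGNOSTIC.search(line)
    ]


if __name__ == "__main__":
    # Shortened sample; the real command line in the log is much longer.
    sample = "\n".join([
        "RuntimeError: Error building extension 'torch_ext': [1/2] c++ -fopenmp -c torch_ext.cpp -o torch_ext.o",
        "FAILED: torch_ext.o",
        "clang: error: unsupported option '-fopenmp'",
        "ninja: build stopped: subcommand failed.",
    ])
    for line in extract_error_lines(sample):
        print(line)
```

On the sample, only the clang line is reported, which matches the reading that the build stops because the compiler does not accept -fopenmp.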
elantrean commented 8 months ago

After I ran pip install torchtext and then reinstalled torch 1.13.0 with pip install torch==1.13.0, the training process worked. Here is the output of the two pip commands:

❯ pip install torchtext 
Collecting torchtext
  Downloading torchtext-0.16.2-cp39-cp39-macosx_11_0_arm64.whl.metadata (7.5 kB)
Requirement already satisfied: tqdm in /Users/cdove/mambaforge/envs/torchdrug/lib/python3.9/site-packages (from torchtext) (4.66.1)
Collecting requests (from torchtext)
  Downloading requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Collecting torch==2.1.2 (from torchtext)
  Downloading torch-2.1.2-cp39-none-macosx_11_0_arm64.whl.metadata (25 kB)
Requirement already satisfied: numpy in /Users/cdove/mambaforge/envs/torchdrug/lib/python3.9/site-packages (from torchtext) (1.26.3)
Collecting torchdata==0.7.1 (from torchtext)
  Downloading torchdata-0.7.1-cp39-cp39-macosx_11_0_arm64.whl.metadata (13 kB)
Collecting filelock (from torch==2.1.2->torchtext)
  Downloading filelock-3.13.1-py3-none-any.whl.metadata (2.8 kB)
Requirement already satisfied: typing-extensions in /Users/cdove/mambaforge/envs/torchdrug/lib/python3.9/site-packages (from torch==2.1.2->torchtext) (4.7.1)
Collecting sympy (from torch==2.1.2->torchtext)
  Downloading sympy-1.12-py3-none-any.whl (5.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.7/5.7 MB 8.6 MB/s eta 0:00:00
Requirement already satisfied: networkx in /Users/cdove/mambaforge/envs/torchdrug/lib/python3.9/site-packages (from torch==2.1.2->torchtext) (3.2.1)
Requirement already satisfied: jinja2 in /Users/cdove/mambaforge/envs/torchdrug/lib/python3.9/site-packages (from torch==2.1.2->torchtext) (3.1.2)
Collecting fsspec (from torch==2.1.2->torchtext)
  Downloading fsspec-2023.12.2-py3-none-any.whl.metadata (6.8 kB)
Collecting urllib3>=1.25 (from torchdata==0.7.1->torchtext)
  Downloading urllib3-2.1.0-py3-none-any.whl.metadata (6.4 kB)
Collecting charset-normalizer<4,>=2 (from requests->torchtext)
  Downloading charset_normalizer-3.3.2-cp39-cp39-macosx_11_0_arm64.whl.metadata (33 kB)
Collecting idna<4,>=2.5 (from requests->torchtext)
  Downloading idna-3.6-py3-none-any.whl.metadata (9.9 kB)
Collecting certifi>=2017.4.17 (from requests->torchtext)
  Downloading certifi-2023.11.17-py3-none-any.whl.metadata (2.2 kB)
Requirement already satisfied: MarkupSafe>=2.0 in /Users/cdove/mambaforge/envs/torchdrug/lib/python3.9/site-packages (from jinja2->torch==2.1.2->torchtext) (2.1.3)
Collecting mpmath>=0.19 (from sympy->torch==2.1.2->torchtext)
  Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 45.2 MB/s eta 0:00:00
Downloading torchtext-0.16.2-cp39-cp39-macosx_11_0_arm64.whl (2.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 16.8 MB/s eta 0:00:00
Downloading torch-2.1.2-cp39-none-macosx_11_0_arm64.whl (59.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 59.6/59.6 MB 15.8 MB/s eta 0:00:00
Downloading torchdata-0.7.1-cp39-cp39-macosx_11_0_arm64.whl (4.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.8/4.8 MB 23.7 MB/s eta 0:00:00
Downloading requests-2.31.0-py3-none-any.whl (62 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.6/62.6 kB 2.5 MB/s eta 0:00:00
Downloading certifi-2023.11.17-py3-none-any.whl (162 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 162.5/162.5 kB 14.5 MB/s eta 0:00:00
Downloading charset_normalizer-3.3.2-cp39-cp39-macosx_11_0_arm64.whl (120 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 120.4/120.4 kB 9.2 MB/s eta 0:00:00
Downloading idna-3.6-py3-none-any.whl (61 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.6/61.6 kB 3.5 MB/s eta 0:00:00
Downloading urllib3-2.1.0-py3-none-any.whl (104 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 104.6/104.6 kB 8.7 MB/s eta 0:00:00
Downloading filelock-3.13.1-py3-none-any.whl (11 kB)
Downloading fsspec-2023.12.2-py3-none-any.whl (168 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 169.0/169.0 kB 16.9 MB/s eta 0:00:00
Installing collected packages: mpmath, urllib3, sympy, idna, fsspec, filelock, charset-normalizer, certifi, torch, requests, torchdata, torchtext
  Attempting uninstall: torch
    Found existing installation: torch 1.13.0
    Uninstalling torch-1.13.0:
      Successfully uninstalled torch-1.13.0
Successfully installed certifi-2023.11.17 charset-normalizer-3.3.2 filelock-3.13.1 fsspec-2023.12.2 idna-3.6 mpmath-1.3.0 requests-2.31.0 sympy-1.12 torch-2.1.2 torchdata-0.7.1 torchtext-0.16.2 urllib3-2.1.0
❯ pip install torch==1.13.0
Collecting torch==1.13.0
  Using cached torch-1.13.0-cp39-none-macosx_11_0_arm64.whl (55.7 MB)
Requirement already satisfied: typing-extensions in /Users/cdove/mambaforge/envs/torchdrug/lib/python3.9/site-packages (from torch==1.13.0) (4.7.1)
Installing collected packages: torch
  Attempting uninstall: torch
    Found existing installation: torch 2.1.2
    Uninstalling torch-2.1.2:
      Successfully uninstalled torch-2.1.2
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchdata 0.7.1 requires torch>=2, but you have torch 1.13.0 which is incompatible.
torchtext 0.16.2 requires torch==2.1.2, but you have torch 1.13.0 which is incompatible.
Successfully installed torch-1.13.0
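The pip output ends with torch 1.13.0 installed alongside torchtext 0.16.2 and torchdata 0.7.1, both of which declare a requirement on torch 2.x. A quick way to confirm what actually ended up in the environment is sketched below using the standard library's `importlib.metadata`. The package list is taken from this thread, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    names = ["torch", "torchtext", "torchdata", "torchdrug"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

This only reports versions; it does not explain why the reinstall made training work, which the thread leaves open.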