DeepGraphLearning / NBFNet

Official implementation of Neural Bellman-Ford Networks (NeurIPS 2021)
MIT License

Unable to run the code with error regarding 'mpiicpc' #4

Open VeritasYin opened 2 years ago

VeritasYin commented 2 years ago

Hello,

I followed the instructions to install the torchdrug-related packages and a matching PyTorch/CUDA version. However, I got the following error when starting training. Any ideas on how to fix this? The system has intel/19.0.3.199 loaded.

01:24:15   Epoch 0 begin
Traceback (most recent call last):
  File "script/run.py", line 62, in <module>
    train_and_validate(cfg, solver)
  File "script/run.py", line 27, in train_and_validate
    solver.train(**kwargs)
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/core/engine.py", line 143, in train
    loss, metric = model(batch)
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/tasks/reasoning.py", line 85, in forward
    pred = self.predict(batch, all_loss, metric)
  File "~/Workspace/Python/NBFNet/nbfnet/task.py", line 288, in predict
    pred = self.model(graph, h_index, t_index, r_index, all_loss=all_loss, metric=metric)
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "~/Workspace/Python/NBFNet/nbfnet/model.py", line 149, in forward
    output = self.bellmanford(graph, h_index[:, 0], r_index[:, 0])
  File "<decorator-gen-888>", line 2, in bellmanford
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/decorator.py", line 56, in wrapper
    return forward(self, *args, **kwargs)
  File "~/Workspace/Python/NBFNet/nbfnet/model.py", line 115, in bellmanford
    hidden = layer(step_graph, layer_input)
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/layers/conv.py", line 91, in forward
    update = self.message_and_aggregate(graph, input)
  File "~/Workspace/Python/NBFNet/nbfnet/layer.py", line 124, in message_and_aggregate
    adjacency = graph.adjacency.transpose(0, 1)
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/decorator.py", line 21, in __get__
    result = self.func(obj)
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/data/graph.py", line 658, in adjacency
    return utils.sparse_coo_tensor(self.edge_list.t(), self.edge_weight, self.shape)
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/torch.py", line 182, in sparse_coo_tensor
    return torch_ext.sparse_coo_tensor_unsafe(indices, values, size)
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/torch.py", line 27, in __getattr__
    return getattr(self.module, key)
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/decorator.py", line 21, in __get__
    result = self.func(obj)
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torchdrug/utils/torch.py", line 31, in module
    return cpp_extension.load(self.name, self.sources, self.extra_cflags, self.extra_cuda_cflags,
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1079, in load
    return _jit_compile(
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1292, in _jit_compile
    _write_ninja_file_and_build_library(
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1378, in _write_ninja_file_and_build_library
    check_compiler_abi_compatibility(compiler)
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 282, in check_compiler_abi_compatibility
    if not check_compiler_ok_for_platform(compiler):
  File "~/anaconda3/envs/dlg_env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 249, in check_compiler_ok_for_platform
    version_string = subprocess.check_output([compiler, '-v'], stderr=subprocess.STDOUT).decode()
  File "~/anaconda3/envs/dlg_env/lib/python3.8/subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "~/anaconda3/envs/dlg_env/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['icpc', '-v']' returned non-zero exit status 1.

KiddoZhu commented 2 years ago

Hi! This error comes from PyTorch JIT, which compiles the C++ extension at runtime; the traceback fails when it runs the compiler check 'icpc -v'. In my experience, PyTorch JIT only works correctly with g++ (Linux), cl (Windows) and c++ (macOS). It can fail with Intel compilers, since some of their command-line arguments differ from those of g++. I would suggest switching to g++, which should solve the problem.
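One way to try the switch to g++ without unloading the Intel module is sketched below. It assumes that this PyTorch version picks the JIT compiler from the CXX environment variable (which would explain why icpc was selected here) and that g++ is on PATH. The helper names prefer_gxx and compiler_responds are hypothetical and are not part of PyTorch or torchdrug.

```python
import os
import shutil
import subprocess


def prefer_gxx(env=None):
    """Point CXX at g++ if one is found on PATH.

    Assumption: torch.utils.cpp_extension reads CXX to choose the compiler.
    Returns the path to g++, or None if g++ was not found (env is then untouched).
    """
    env = os.environ if env is None else env
    gxx = shutil.which("g++")
    if gxx is not None:
        env["CXX"] = gxx
    return gxx


def compiler_responds(compiler):
    """Run `<compiler> -v`, the same probe that failed in the traceback."""
    try:
        subprocess.check_output([compiler, "-v"], stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        return False
    return True


if __name__ == "__main__":
    # Call this before anything triggers the JIT build (e.g. at the top of script/run.py).
    gxx = prefer_gxx()
    if gxx is None:
        print("g++ not found on PATH; CXX left unchanged")
    else:
        print("CXX set to", gxx, "| responds to -v:", compiler_responds(gxx))
```

Exporting CXX=g++ in the shell before launching script/run.py should have the same effect. If an earlier build attempt left partial files in the extension cache, removing that cache before retrying may also be needed.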

AnupKumarGupta commented 1 month ago

Hello @VeritasYin, any update on how to resolve this issue? I am facing a similar one. Any help would be greatly appreciated. Thanks in advance.