scverse / scvi-tools

Deep probabilistic analysis of single-cell and spatial omics data
http://scvi-tools.org/
BSD 3-Clause "New" or "Revised" License

Trying to run PeakVI model returns mpi4py error #2836

Open Sun-storm opened 2 weeks ago

Sun-storm commented 2 weeks ago

Whenever I try to run the model, or (after having run the model somewhere else) try to load it with model = scvi.model.PEAKVI.load(model_dir, adata=adata), I get the error shown below. I've tried installing Open MPI and quite a few other things, but nothing seems to work.

python
Python 3.12.2 | packaged by conda-forge | (main, Feb 16 2024, 20:50:58) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import tempfile
>>> from pathlib import Path
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> import pooch
>>> import scanpy as sc
>>> import scvi
>>> import torch
>>> os.chdir('/home/.../data')
>>> adata = scvi.data.read_h5ad('combined_seurat_object.h5ad')
>>> model_dir = os.path.join("model/peakvi_trained", "model.pt")
>>> model = scvi.model.PEAKVI.load(model_dir, adata=adata)

/services/tools/scvi-tools/1.1.2/lib/python3.12/site-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")

--------------------------------------------------------------------------------
<stdin> 1 <module>
1

_base_model.py 662 load
_, _, device = parse_device_args(

_utils.py 101 parse_device_args
connector = _AcceleratorConnector(accelerator=accelerator, devices=devices)

accelerator_connector.py 152 __init__
self.cluster_environment: ClusterEnvironment = self._choose_and_init_cluster_environment()

accelerator_connector.py 421 _choose_and_init_cluster_environment
if env_type.detect():

mpi.py 71 detect
from mpi4py import MPI

ImportError:
libmpi.so.40: cannot open shared object file: No such file or directory

Versions:

1.1.3
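The traceback above ends at `from mpi4py import MPI`, which fails because `libmpi.so.40` cannot be found. One way to confirm this independently of scvi-tools is to try the import directly. The following is only a minimal sketch using the standard library; `try_import` is a hypothetical helper name, not part of scvi-tools, Lightning, or mpi4py.

```python
import importlib


def try_import(module_name: str) -> tuple[bool, str]:
    """Try to import a module and report whether it succeeded.

    Hypothetical helper for diagnosis only. A missing shared library
    (such as libmpi.so.40) surfaces as an ImportError and is reported
    here instead of being raised.
    """
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


if __name__ == "__main__":
    # Check the package first, then the compiled submodule that links against MPI.
    for name in ("mpi4py", "mpi4py.MPI"):
        ok, detail = try_import(name)
        status = "importable" if ok else "FAILED"
        print(f"{name}: {status} ({detail})")
```

If `mpi4py` imports but `mpi4py.MPI` fails, that would suggest mpi4py is installed in the environment while the MPI shared library it was built against is not on the loader path.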

canergen commented 2 weeks ago

This seems to be a problem with the CUDA installation on that specific workstation. In fact, torch can't initialize (see the NVML warning). Can you check that nvidia-smi works from the command line and that calls like these succeed:

>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> torch.cuda.current_device()
0
>>> torch.cuda.device(0)
<torch.cuda.device at 0x7efce0b03be0>
>>> torch.cuda.get_device_name(0)
'GeForce GTX 950M'

If these don't work, and setting up a fresh environment with only a CUDA-enabled torch installed doesn't help either, you might want to check with your system administrator or update your CUDA drivers yourself.
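For convenience, the torch checks above can be collected into one script that does not crash when something is missing. This is just a sketch under the assumption that only the `torch.cuda` calls already listed are needed; `cuda_report` is a hypothetical name, and the returned dictionary keys are made up for illustration.

```python
import importlib.util


def cuda_report() -> dict:
    """Collect basic CUDA diagnostics from torch, if torch is installed.

    Hypothetical helper: it wraps the same torch.cuda calls as the
    interactive checks and returns them as a dictionary.
    """
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    report = {
        "torch_installed": True,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }
    # Only query the device name when CUDA is usable; otherwise the call raises.
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` comes back False while nvidia-smi works, the installed torch build may be CPU-only or may not match the driver.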