ma-compbio / Higashi

single-cell Hi-C, scHi-C, Hi-C, 3D genome, nuclear organization, hypergraph
MIT License

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (15361,) + inhomogeneous part. #47

Closed: GMFranceschini closed this issue 7 months ago

GMFranceschini commented 8 months ago

Dear developers, I encounter this error when running higashi_model.train_for_imputation_nbr_0().

[...]/mamba_root/envs/higashi/lib/python3.9/site-packages/higashi-0.1.0a0-py3.9.egg/higashi/Higashi_wrapper.py", line 321, in one_thread_generate_neg
    to_neighs = np.array(to_neighs)[:-1]
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (15361,) + inhomogeneous part.
"""

Do you have any idea what the problem might be? Maybe my version of NumPy is too recent? Any advice is appreciated.

My script looks like this.

from higashi.Higashi_wrapper import *
from fasthigashi.FastHigashi_Wrapper import *
config = './../higashi_WGD/config_wgd_chr21.JSON'
higashi_model = Higashi(config)
import os

num_gpus = len(os.environ['CUDA_VISIBLE_DEVICES'].split(','))
print(num_gpus)
print(torch.cuda.is_available())

local_rank = 0 
torch.cuda.set_device(local_rank)

os.environ['CUDA_VISIBLE_DEVICES'] = ','.join(f'{i}' for i in range(num_gpus))

print("Processing data...")
higashi_model.process_data()

print("Initializing...")
# Initialize the model
fh_model = FastHigashi(
    config_path=config,
    path2input_cache="../higashi_WGD/tmp/",
    path2result_dir="../higashi_WGD/tmp/",
    off_diag=100,  # 0-100th diag of the contact maps would be used.
    filter=False,  # fit the model on high quality cells, transform the rest
    do_conv=False,  # linear convolution imputation
    do_rwr=False,  # partial random walk with restart imputation
    do_col=False,  # sqrt_vc normalization
    no_col=False,  # force to not do sqrt_vc normalization
)

print("Preparing...")
# Pack from sparse mtx to tensors
fh_model.prep_dataset()

print("Running...")
fh_model.run_model(dim1=0.6, rank=16, n_iter_parafac=1, extra="WGD")

print("Preparing and training...")
higashi_model.prep_model()
# Stage 1 training
higashi_model.train_for_embeddings()

print("Preparing for imputation...")
higashi_model.train_for_imputation_nbr_0()
higashi_model.impute_no_nbr()

higashi_model.train_for_imputation_with_nbr()
higashi_model.impute_with_nbr()

Here is my conda env:

higashi.txt

Unrelated, but maybe useful for you: I had to force the CUDA device to zero in get_free_gpu() to run on a Slurm cluster. It oddly kept switching to GPU 1 when only one GPU (GPU 0) was requested, which blocked execution.

ruochiz commented 8 months ago

Hmm. I'm guessing it might be due to some NumPy auto-broadcasting issue. Could you try replacing that line in Higashi_wrapper.py, changing to_neighs = np.array(to_neighs)[:-1] to to_neighs = np.array(to_neighs, dtype='object')[:-1], and see if that fixes the error? If so, I'll update the repo accordingly. Thanks!
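For readers hitting the same error, here is a minimal sketch of what the suggested change does, using a made-up ragged list rather than Higashi's real data. Recent NumPy releases (1.24 and later) raise this ValueError when np.array() is given sequences of different lengths without an explicit dtype; passing dtype=object keeps each inner list as one element. The helper to_object_array is hypothetical and not part of Higashi; it covers the edge case where all inner lists happen to have the same length, in which np.array(..., dtype=object) would build a 2-D array instead.

```python
import numpy as np


def to_object_array(seqs):
    """Pack sequences of possibly different lengths into a 1-D object array.

    Hypothetical helper for illustration only (not part of Higashi).
    """
    out = np.empty(len(seqs), dtype=object)
    for i, s in enumerate(seqs):
        out[i] = s  # each slot holds one whole sequence
    return out


# Toy stand-in for a ragged neighbor list: rows have different lengths.
ragged = [[1, 2, 3], [4], [5, 6]]

# Without an explicit dtype, newer NumPy refuses to build the array;
# older releases only warned, so both outcomes are handled here.
try:
    np.array(ragged)
    implicit_ok = True
except ValueError:
    implicit_ok = False

# The change suggested above: request an object array explicitly.
explicit = np.array(ragged, dtype=object)[:-1]

# Same result regardless of whether the rows happen to be equal length.
uniform = to_object_array([[1, 2], [3, 4]])
```

With dtype=object, explicit is a 1-D array whose elements are the original Python lists, so slicing with [:-1] drops the last list as the original line intended.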

And regarding the GPU error: thanks for letting me know. I think it's likely because I use nvidia-smi on the command line to find the GPU with the most available memory. On Slurm machines, it's possible that even if you are assigned gpu:0, nvidia-smi still reports the other GPUs, and one of them happens to have slightly more free memory. I'll think of something to fix it.
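One possible direction, sketched under assumptions: restrict the choice to the devices listed in CUDA_VISIBLE_DEVICES before picking the one with the most free memory. The function pick_visible_gpu and its arguments are hypothetical and are not how get_free_gpu() is currently written. It assumes the per-GPU free memory has already been read (for example from nvidia-smi) as a list indexed by physical GPU, and that CUDA_VISIBLE_DEVICES holds integer indices rather than UUIDs.

```python
def pick_visible_gpu(free_mem_mib, visible_devices=None):
    """Choose the visible GPU with the most free memory.

    free_mem_mib: free memory per physical GPU, indexed by physical id
        (assumed to be collected elsewhere, e.g. from nvidia-smi).
    visible_devices: value of CUDA_VISIBLE_DEVICES, or None if unset.

    Returns (logical_index, physical_index). The logical index is the
    position inside the visible list, which is what PyTorch expects
    because it only enumerates visible devices.
    """
    if visible_devices is None:
        # Variable unset: every physical GPU is a candidate.
        candidates = list(range(len(free_mem_mib)))
    else:
        candidates = [int(tok) for tok in visible_devices.split(",") if tok.strip()]
    if not candidates:
        raise ValueError("no visible GPU to choose from")
    best_logical = max(
        range(len(candidates)),
        key=lambda i: free_mem_mib[candidates[i]],
    )
    return best_logical, candidates[best_logical]
```

For a job that was assigned only GPU 0, pick_visible_gpu([1000, 12000], "0") stays on that device even though physical GPU 1 reports more free memory. One caveat: nvidia-smi and CUDA can order devices differently unless CUDA_DEVICE_ORDER=PCI_BUS_ID is set, so the physical indices may need that setting to line up.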

GMFranceschini commented 7 months ago

Thank you! I confirm that the fix worked. Now training is running, and I am no longer observing the error.

Also, you are right about the GPU; that is exactly what is happening in our case, as we have 2 GPU nodes together. Unfortunately, I am encountering some problems training on the Slurm cluster via GPU: the training-for-imputation step gets stuck and never progresses. I will investigate this further, as everything works locally on my GPU.

ruochiz commented 7 months ago

Thanks for the confirmation!