single-cell-genetics / vireo

Demultiplexing pooled scRNA-seq data with or without genotype reference
https://vireoSNP.readthedocs.io
Apache License 2.0
71 stars 25 forks source link

ValueError: column index exceeds matrix dimensions #104

Open hsymoon opened 2 months ago

hsymoon commented 2 months ago

Hi eveyone, I want to verify the best donor numbers of my vcf-data,which is the outputs of cellsnp-lite. I tried the demo command developers provided as follows:
`cell_vcf = vireoSNP.load_VCF('batch1_cellSNP.cells.vcf.gz', biallelic_only=True)

cell_dat = vireoSNP.vcf.read_sparse_GeneINFO(cell_vcf['GenoINFO'], keys=['AD', 'DP'])

AD = cell_dat['AD']

DP = cell_dat['DP']

n_donor_list = np.arange(2, 7)

ELBO_list_all = []

for _n_don in n_donor_list: res = vireoSNP.vireo_wrap(AD, DP, n_donor=_n_don, learn_GT=True, n_extra_donor=0, ASE_mode=False, fix_beta_sum=False, n_init=50, check_doublet=True, random_seed=1) ELBO_list_all.append(res['LB_list']) `

but I got the reply as follows:

multiprocessing.pool.RemoteTraceback: """ Traceback (most recent call last): File "/public/home/syhou/miniconda3/lib/python3.10/multiprocessing/pool.py", line 125, in worker result = (True, func(*args, **kwds)) File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/vireoSNP/utils/vireo_wrap.py", line 17, in _model_fit _model.fit(AD, DP, min_iter=5, max_iter=max_iter, File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/vireoSNP/utils/vireo_model.py", line 313, in fit ELBO += np.sum(get_binom_coeff(AD, DP)) File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/vireoSNP/utils/vireo_base.py", line 15, in get_binom_coeff _AD = AD[idx].astype(np.int64) File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_index.py", line 44, in getitem row, col = self._validate_indices(key) File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_index.py", line 147, in _validate_indices row, col = _unpack_index(key) File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_index.py", line 272, in _unpack_index return index.nonzero() File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_base.py", line 826, in nonzero A = self.tocoo() File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_compressed.py", line 1040, in tocoo return self._coo_container( File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_coo.py", line 204, in init self._check() File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/scipy/sparse/_coo.py", line 293, in _check raise ValueError('column index exceeds matrix dimensions') ValueError: column index exceeds matrix dimensions """

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "", line 2, in File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/vireoSNP/utils/vireo_wrap.py", line 83, in vireo_wrap _models_all = [res.get() for res in result] File "/public/home/syhou/miniconda3/lib/python3.10/site-packages/vireoSNP/utils/vireo_wrap.py", line 83, in _models_all = [res.get() for res in result] File "/public/home/syhou/miniconda3/lib/python3.10/multiprocessing/pool.py", line 774, in get raise self._value ValueError: column index exceeds matrix dimensions

It is so sad,here is my information of vcf_data used : Number of samples: 9809 Number of SNPs: 14131 Number of INDELs: 0 Number of MNPs: 0 Number of others: 0 Number of sites: 14131

Can you help me , Thanks for replying

huangyh09 commented 2 months ago

Thanks for sharing the issue. While you shared the info in the VCF file, can you still try print(cell_dat['AD'].shape)?

hsymoon commented 2 months ago

print(cell_dat['AD'].shape)

Hello,here is the command output: print(cell_dat['AD'].shape) (14131, 9809) Thanks for helping me

huangyh09 commented 2 months ago

Thanks. This looks normal. Can you further try?

from vireoSNP.utils.vireo_base import get_binom_coeff, logbincoeff

np.sum(get_binom_coeff(AD, DP))

np.sum(logbincoeff(DP, AD, is_sparse=True))