I am trying to run scaden simulate on a .h5ad object with ~21,000 cells and 25 cell types. I previously ran this on another .h5ad object successfully.
I am using the following command:
scaden simulate --out /data/Deconvolution/Scaden/Output/ --cells 200 --n_samples 1000 --data /data/Deconvolution/Scaden/Input/ --data-format h5ad --pattern *.h5ad
However, I receive the following error:
INFO Datasets: ['data'] bulk_simulator.py:84
INFO Simulating data from data bulk_simulator.py:89
INFO Loading data dataset ... bulk_simulator.py:141
INFO Merging unknown cell types: ['unknown'] bulk_simulator.py:107
INFO Subsampling data ... bulk_simulator.py:110
Traceback (most recent call last):
  File "/data/anaconda/envs/scaden/bin/scaden", line 8, in <module>
    sys.exit(main())
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/__main__.py", line 48, in main
    cli()
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/__main__.py", line 207, in simulate
    simulation(
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/simulate.py", line 22, in simulation
    bulk_simulator.simulate()
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/simulation/bulk_simulator.py", line 90, in simulate
    self.simulate_dataset(dataset)
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/simulation/bulk_simulator.py", line 114, in simulate_dataset
    tmp_x, tmp_y = self.create_subsample_dataset(
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/simulation/bulk_simulator.py", line 253, in create_subsample_dataset
    sample, label = self.create_subsample(x, y, celltypes)
  File "/data/anaconda/envs/scaden/lib/python3.8/site-packages/scaden/simulation/bulk_simulator.py", line 305, in create_subsample
    cells_fraction = np.random.randint(0, cells_sub.shape[0], samp_fracs[i])
  File "mtrand.pyx", line 748, in numpy.random.mtrand.RandomState.randint
  File "_bounded_integers.pyx", line 1247, in numpy.random._bounded_integers._rand_int64
ValueError: high <= 0
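For reference, the last Scaden line in the traceback can be mimicked in isolation. This is only a sketch of my guess at the cause, not something confirmed from the Scaden source beyond that one line: if `cells_sub` has zero rows for some cell type, `np.random.randint` receives an upper bound of 0 and raises. The array contents and the helper name `draw_cell_indices` below are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for `cells_sub` in the traceback: the expression
# matrix of one cell type that ended up with zero cells (0 rows, 5 genes).
cells_sub = np.empty((0, 5))


def draw_cell_indices(n_available, n_requested):
    """Same call shape as the failing line: np.random.randint(0, high, size)."""
    return np.random.randint(0, n_available, n_requested)


# Requesting 10 cells from a cell type with no cells raises a ValueError.
try:
    draw_cell_indices(cells_sub.shape[0], 10)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)

# With at least one cell available, the same call succeeds.
ok = draw_cell_indices(3, 10)
```

If this guess is right, it would point to at least one cell type label having no matching cells in my input, but I have not verified that.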
Hi Kevin,
Thank you for the great package.
My matrix of the input .h5ad looks like this:
adata[0:5,0:5].X.todense()
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0.9539254, 1.9078507, 0. ],
[0. , 0. , 0. , 4.070004 , 0. ]
Could you please let me know how I might be able to fix this?
Additionally, do you have any advice on how to select the --cells and --n_samples parameters or can these generally be kept as the default values?
Many thanks, Elise