Scaden simulate ValueError: low >= high

mHagiw commented 3 years ago

Hi,

Thanks for your great work on scaden.

I've recently been trying to use Scaden to estimate cell fractions. With the newly released Scaden v1.1.2, I ran the 'scaden simulate' command, but it resulted in the same error:

  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/scaden/simulation/bulk_simulator.py", line 305, in create_subsample
    cells_fraction = np.random.randint(0, cells_sub.shape[0], samp_fracs[i])
  File "mtrand.pyx", line 747, in numpy.random.mtrand.RandomState.randint
  File "_bounded_integers.pyx", line 1254, in numpy.random._bounded_integers._rand_int64
ValueError: low >= high
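For readers hitting the same traceback: NumPy raises this error when the upper bound passed to np.random.randint is not greater than the lower bound. In the failing line the upper bound is cells_sub.shape[0], so the error suggests that cells_sub had zero rows. The sketch below only reproduces the NumPy behaviour; draw_indices is a hypothetical helper for illustration, not part of Scaden.

```python
import numpy as np


def draw_indices(n_cells, n_draws):
    """Draw n_draws random row indices from a table with n_cells rows.

    Hypothetical helper that mirrors the shape of the failing call,
    np.random.randint(0, n_cells, n_draws). It returns the error text
    instead of raising, so both cases can be compared side by side.
    """
    try:
        return np.random.randint(0, n_cells, n_draws)
    except ValueError as err:
        return f"ValueError: {err}"


# A non-empty subset works: three indices in the range [0, 10).
ok = draw_indices(10, 3)

# An empty subset (zero rows) makes the upper bound equal to the lower bound.
empty = draw_indices(0, 3)

print(ok)
print(empty)
```

So the thing to look for is a cell type for which the selection of matching cells comes back empty.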

My command for 'scaden simulate' is: scaden simulate -n 100 --data ./scanpy --pattern "_counts.txt"

The count data file '_counts.txt' was generated following the normalization steps in your example Jupyter notebook and contains 19770 genes (columns) and 83085 cells (rows). The files are formatted as follows.

count: [Screenshot 2021-08-20 17 38 18]

celltype: [Screenshot 2021-08-20 17 39 10]

I checked the number of rows and columns in both files.

I suppose something is wrong with the format, but I have tried several times and couldn't figure out the cause. Could you give me a suggestion on this issue? Thanks a lot!

Masaki

KevinMenden commented 3 years ago

Hi @mHagiw ,

thanks for reporting this issue! I will try to have a look at it as soon as possible - hopefully this weekend :)

Best, Kevin

mHagiw commented 3 years ago

Thanks for the quick response!

I'm looking forward to it.

I tried reducing my data. As a result, df.head(4000) passed but df.head(5000) did not.

So I suspect that something in rows 4001-5000 is causing the error.

Best regards, Masaki

mHagiw commented 3 years ago

Hi,

Sorry, I found the problem in my df.

The celltypes df contained some NaN values.

I've solved the problem, and I'm now able to try the next steps with Scaden!
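In case it helps others, a check along these lines would find the missing labels. This is a minimal sketch on toy data, not the exact code I ran: the column name "Celltype" and the gene names are assumptions, and the two small DataFrames stand in for the real counts and celltypes files. A NaN label never compares equal to anything, so selecting cells for it gives an empty table, which would presumably lead to the 'low >= high' error.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the real files; names and values are hypothetical.
counts = pd.DataFrame(np.arange(8).reshape(4, 2), columns=["geneA", "geneB"])
celltypes = pd.DataFrame({"Celltype": ["T cell", np.nan, "B cell", "T cell"]})

# NaN != NaN, so an equality-based selection for a NaN label is always empty.
nan_selection = celltypes[celltypes["Celltype"] == np.nan]

# Count the cells without a label.
missing = celltypes["Celltype"].isna()
n_missing = int(missing.sum())
print(f"{n_missing} of {len(celltypes)} cells have no cell type label")

# Drop unlabelled cells from BOTH tables so that rows stay aligned.
keep = ~missing.to_numpy()
counts_clean = counts.loc[keep].reset_index(drop=True)
celltypes_clean = celltypes.loc[keep].reset_index(drop=True)
```

Whether to drop the unlabelled cells or assign them a label depends on the dataset; the important part is that the counts and celltypes tables keep the same rows in the same order.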

Thanks. Masaki

KevinMenden / scaden

Scaden simulate ValueError: low >= high #109