KevinMenden / scaden

Deep Learning based cell composition analysis with Scaden.
https://scaden.readthedocs.io
MIT License

ValueError: Input contains infinity or a value too large for dtype('float32'). #113

Closed mHagiw closed 3 years ago

mHagiw commented 3 years ago

Hi,

Thanks for your great work on scaden.

I've recently been trying to use scaden to estimate cell fractions. With the newly released Scaden v1.1.2, I tried to run the 'scaden process' command, but it resulted in the same error:

INFO Scaling using log_min_max functions.py:65
Traceback (most recent call last):
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/bin/scaden", line 8, in <module>
    sys.exit(main())
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/scaden/__main__.py", line 48, in main
    cli()
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/scaden/__main__.py", line 155, in process
    processing(
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/scaden/process.py", line 35, in processing
    preprocess_h5ad_data(raw_input_path=training_data,
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/scaden/model/functions.py", line 67, in preprocess_h5ad_data
    raw_input.X = sample_scaling(raw_input.X, scaling_option)
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/scaden/model/functions.py", line 42, in sample_scaling
    x = mms.fit_transform(x.T).T
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/sklearn/base.py", line 699, in fit_transform
    return self.fit(X, **fit_params).transform(X)
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/sklearn/preprocessing/_data.py", line 363, in fit
    return self.partial_fit(X, y)
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/sklearn/preprocessing/_data.py", line 396, in partial_fit
    X = self._validate_data(X, reset=first_pass,
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/sklearn/base.py", line 421, in _validate_data
    X = check_array(X, **check_params)
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/sklearn/utils/validation.py", line 720, in check_array
    _assert_all_finite(array,
  File "/home/mhagiwara/Programs/anaconda3/envs/scaden/lib/python3.9/site-packages/sklearn/utils/validation.py", line 103, in _assert_all_finite
    raise ValueError(
ValueError: Input contains infinity or a value too large for dtype('float32').

[Screenshot 2021-09-01 15 05 37]

My command for 'scaden process' is: scaden process tabula_MC38.h5ad dataset

I have already checked whether the data contain any infinite values.

The run succeeded when we changed only the gene name: 2210010C04Rik (failed) ⇒ Foxp3 (OK!!)

May I have your suggestion on this issue? Thanks a lot!

Masaki

KevinMenden commented 3 years ago

Hi @mHagiw ,

Sorry for the late response. This does sound weird, and I can't tell you right away how it happened. Since you closed the issue, did you figure it out by any chance?

mHagiw commented 3 years ago

Hi Kevin

I resolved my problem. I found inf values in the simulated data (.h5ad), not in the bulk data.

So I was finally able to run it.

Thanks!!

Masaki Hagiwara

KevinMenden commented 3 years ago

Awesome, thanks for the explanation!
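
For anyone hitting the same ValueError: the resolution above was that the simulated .h5ad data contained inf values. A minimal sketch of one way to locate such entries is shown below. It is not part of scaden; `find_non_finite` and `check_h5ad` are hypothetical helper names, the toy matrix is made up, and `check_h5ad` assumes the anndata package is installed and that the matrix of interest is stored in `.X`.

```python
import numpy as np


def find_non_finite(matrix, row_names=None, col_names=None):
    """Return (row, column, value) triples for every inf/NaN entry of a dense 2-D array."""
    values = np.asarray(matrix, dtype=np.float64)
    rows, cols = np.nonzero(~np.isfinite(values))
    hits = []
    for r, c in zip(rows, cols):
        row = row_names[r] if row_names is not None else int(r)
        col = col_names[c] if col_names is not None else int(c)
        hits.append((row, col, float(values[r, c])))
    return hits


def check_h5ad(path):
    """Hypothetical helper: load an .h5ad file and report its non-finite entries."""
    import anndata  # imported lazily so the array check above works without it

    adata = anndata.read_h5ad(path)
    # .X may be a sparse matrix; convert to dense for this simple check.
    x = adata.X.toarray() if hasattr(adata.X, "toarray") else adata.X
    return find_non_finite(x, list(adata.obs_names), list(adata.var_names))


# Toy matrix standing in for simulated samples x genes (made-up numbers).
toy = np.array([[1.0, 2.0, 3.0],
                [4.0, np.inf, 6.0]])
print(find_non_finite(toy, col_names=["GeneA", "GeneB", "GeneC"]))
```

Calling `check_h5ad` on the file passed to `scaden process` would list the affected samples and genes. Note that this dense conversion may be memory-hungry for large datasets.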