jsxlei / SCALE

Single-cell ATAC-seq analysis via Latent feature Extraction
MIT License

ValueError: Input contains NaN, infinity or a value too large for dtype('float32'). #7

Open zji90 opened 4 years ago

zji90 commented 4 years ago

Hi, I encountered the following error when running SCALE. I have checked my data matrix: the maximum value is 35, the minimum is 0, and there are no NaN values. How can I fix this? Thanks!

Traceback (most recent call last):
  File "/home-4/zji4@jhu.edu/scratch/software/scale/SCALE/SCALE.py", line 134, in <module>
    pred = model.predict(testloader, device)
  File "/scratch/users/zji4@jhu.edu/software/scale/SCALE/scale/model.py", line 97, in predict
    pred = kmeans.fit_predict(feature)
  File "/home-4/zji4@jhu.edu/.local/lib/python3.7/site-packages/sklearn/cluster/kmeans.py", line 998, in fit_predict
    return self.fit(X, sample_weight=sample_weight).labels_
  File "/home-4/zji4@jhu.edu/.local/lib/python3.7/site-packages/sklearn/cluster/kmeans.py", line 972, in fit
    return_n_iter=True)
  File "/home-4/zji4@jhu.edu/.local/lib/python3.7/site-packages/sklearn/cluster/kmeans.py", line 312, in k_means
    order=order, copy=copy_x)
  File "/home-4/zji4@jhu.edu/.local/lib/python3.7/site-packages/sklearn/utils/validation.py", line 542, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File "/home-4/zji4@jhu.edu/.local/lib/python3.7/site-packages/sklearn/utils/validation.py", line 56, in _assert_all_finite
    raise ValueError(msg_err.format(type_err, X.dtype))
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

jsxlei commented 4 years ago

See the note on the README page. If you come across a NaN loss:

- try another random seed
- filter peaks with a harsher threshold, e.g. -x 0.04 or 0.06
- filter low-quality cells, e.g. --min_peaks 400 or 600
- change the initial learning rate, e.g. --lr 0.0002
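To preview how much data the filtering options in the README note would remove, a rough NumPy sketch like the one below may help. It is not SCALE's own code. The function name `filter_cells_and_peaks` is hypothetical. It also assumes that `-x` is the minimum fraction of cells in which a peak must be open, that `--min_peaks` is the minimum number of open peaks per cell, and that cells are filtered before peaks. Check the SCALE source for the actual definitions.

```python
import numpy as np

def filter_cells_and_peaks(counts, min_peaks=400, min_cell_frac=0.04):
    """Filter a cells x peaks count matrix (hypothetical helper, not SCALE's API).

    Assumed meaning of the thresholds:
      min_peaks     -> keep cells with at least this many open peaks
      min_cell_frac -> keep peaks open in at least this fraction of kept cells
    """
    counts = np.asarray(counts)
    binary = counts > 0                          # open / closed per cell and peak

    # Drop low-quality cells first.
    keep_cells = binary.sum(axis=1) >= min_peaks
    kept = binary[keep_cells]

    # Then drop rare peaks, measured on the remaining cells.
    if kept.shape[0] == 0:
        keep_peaks = np.zeros(counts.shape[1], dtype=bool)
    else:
        keep_peaks = kept.mean(axis=0) >= min_cell_frac

    filtered = counts[keep_cells][:, keep_peaks]
    return filtered, keep_cells, keep_peaks
```

Comparing `filtered.shape` with the original shape for a few thresholds shows how aggressive each setting is before you rerun training.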

This error is not caused by NaN values in your input data. It comes from exploding gradients triggered by some outlier samples during training, which leave NaN values in the features passed to k-means. Filtering out low-quality cells or rare peaks usually solves the issue.
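To confirm that the non-finite values are in the model output and not in the input matrix, you could check the latent features just before they reach k-means. Below is a minimal sketch, assuming `feature` is a cells x dimensions array like the one passed to `kmeans.fit_predict(feature)` in the traceback. The helper name `assert_finite_features` is hypothetical and not part of SCALE.

```python
import numpy as np

def assert_finite_features(feature, name="feature"):
    """Raise a descriptive error if the array contains NaN or infinity.

    Hypothetical debugging helper; returns the array unchanged when it is clean.
    """
    feature = np.asarray(feature, dtype=np.float32)
    bad = ~np.isfinite(feature)                  # True where NaN or +/-inf
    if bad.any():
        bad_rows = np.unique(np.nonzero(bad)[0])
        raise ValueError(
            f"{name} has {int(bad.sum())} non-finite values in "
            f"{bad_rows.size} of {feature.shape[0]} rows; "
            "training probably diverged (NaN loss)."
        )
    return feature
```

If the check fails on the latent features but passes on the input matrix, the problem lies in training, and the README suggestions (seed, filtering, learning rate) are the things to try.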