KlugerLab / ALRA

Imputation method for scRNA-seq based on low-rank approximation
MIT License

In nrow(A_norm) * ncol(A_norm) : NAs produced by integer overflow #30

Open haoqing12 opened 5 months ago

haoqing12 commented 5 months ago

I am working with a particularly large dataset of 2,423,133 cells and 1,091 genes. When running ALRA on it, I get repeated integer-overflow warnings, and the percentages in the log are printed as NA. Here is the output:

r$> A_norm_completed <- alra(A_norm,k=k_choice$k)[[3]]
Read matrix with 2423133 cells and 1091 genes
Getting nonzeros
Randomized SVD
Find the 0.001000 quantile of each gene
Sweep
Scaling all except for 0 columns
NA% of the values became negative in the scaling process and were set to zero
The matrix went from NA% nonzero to NA% nonzero
Warning messages:
1: In nrow(A_norm) * ncol(A_norm) : NAs produced by integer overflow
2: In nrow(A_norm) * ncol(A_norm) : NAs produced by integer overflow
3: In nrow(A_norm) * ncol(A_norm) : NAs produced by integer overflow
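
For context, the overflow can be reproduced from the dimensions in the log alone. This is a minimal sketch in base R, assuming only that `nrow()` and `ncol()` return 32-bit integers, as they do for an ordinary matrix. The names `n_cells`, `n_genes`, `overflowed`, and `total_entries` are illustrative and are not ALRA variables.

```r
# Dimensions reported in the log, stored as integers like nrow()/ncol() results.
n_cells <- 2423133L
n_genes <- 1091L

# Integer * integer stays integer in R. The true product exceeds
# .Machine$integer.max (2147483647), so R returns NA with a warning.
overflowed <- suppressWarnings(n_cells * n_genes)
is.na(overflowed)

# Converting one operand to double before multiplying avoids the overflow.
total_entries <- as.numeric(n_cells) * n_genes
total_entries > .Machine$integer.max
```

The three warnings line up with the three `NA%` figures, so the overflow may only affect the reported percentages and not the imputed matrix itself. That is a guess I have not verified against the ALRA source.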

[Screenshot 2024-04-30 16 38 17]

The zeros do not seem to have been completed (imputed) properly.

Could you please provide any suggestions on how to mitigate this issue? Is there a recommended approach to handling large datasets with your software, or perhaps a parameter adjustment that I might not be aware of?
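
One possible way to check the result independently of the log messages is to recompute the nonzero percentage in double precision. The sketch below is only an illustration: `pct_nonzero` is a hypothetical helper, not part of ALRA, and `toy` is a small stand-in for the real matrices.

```r
# Hypothetical helper (not part of ALRA): percentage of nonzero entries,
# computed in double precision so that large matrices do not overflow.
pct_nonzero <- function(m) {
  # colSums() returns doubles, so the per-column counts add up safely.
  nonzero_count <- sum(colSums(m != 0))
  # prod() returns a double, unlike nrow(m) * ncol(m) on integer dimensions.
  100 * nonzero_count / prod(dim(m))
}

# Small stand-in matrix: 3 of its 6 entries are nonzero.
toy <- matrix(c(0, 1, 2, 0, 0, 3), nrow = 2)
toy_pct <- pct_nonzero(toy)
```

On the real data the same helper could be applied to `A_norm` and to `A_norm_completed` to see whether the share of nonzero entries actually changed. Note that `m != 0` allocates a logical matrix the same shape as `m`, which is substantial at this scale.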