sunduanchen / Scissor

Scissor package
GNU General Public License v3.0

invalid 'ncol' value (too large or NA) #9

Closed MTutino closed 2 years ago

MTutino commented 2 years ago

Hi,

I am trying to use Scissor with family "binomial" on a dataset of bulk RNAseq with 43 samples (11 controls and 32 cases) and publicly available scRNAseq with 5092 cells. I receive the following error message

[1] "|**|"
[1] "Performing quality-check for the correlations"
[1] "The five-number summary of correlations:"
       0%       25%       50%       75%      100%
0.3543059 0.4483379 0.4650750 0.4824858 0.6122901
[1] "|**|"
[1] "Current phenotype contains 11 healthy and 32 asthmatics samples."
[1] "Perform logistic regression on the given phenotypes:"
Error in matrix(NA, nrow = nfolds, ncol = numi2) :
  invalid 'ncol' value (too large or NA)
In addition: Warning messages:
1: In Scissor(mat, sc_dataset, as.matrix(pheno), tag = tag, alpha = 0.2, :
  NAs introduced by coercion
2: In matrix(sapply(outi, function(x) { :
  data length [10] is not a sub-multiple or multiple of the number of rows [5093]
3: In matrix(sapply(outi, function(x) { :
  data length [10] is not a sub-multiple or multiple of the number of rows [5093]

Could you please help?

Thank you

sunduanchen commented 2 years ago

Can you share your data with me so I can try to reproduce this?

Thanks, Duanchen

MTutino commented 2 years ago

I found out what the issue was. I was running Scissor with family "binomial" and supplying the phenotype as a 2-column matrix (first column sample IDs, second column the phenotype) instead of a vector of binary phenotypes. Scissor started fine: it returned the five-number summary of correlations and the number of cases and controls. It failed with the error reported above only after starting the logistic regression. I only understood where the issue was when I ran the example data from the tutorial.

I think it would be helpful to specify in the tutorial what the phenotype input should be and, possibly, to exit with an error if the wrong input type is used. Thank you for the great tool, though!
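For anyone hitting the same error, here is a minimal sketch of the fix described above: building a plain binary vector from a 2-column phenotype table before calling Scissor. The table `pheno_table`, its column names, and the helper `as_binary_phenotype()` are hypothetical and not part of Scissor. The sketch also assumes, as the log output in this thread suggests, that the first element of `tag` labels the 0 group and the second labels the 1 group. In base R, coercing sample IDs with `as.numeric()` produces `NA`, which is consistent with the "NAs introduced by coercion" warning in the original report.

```r
# Hypothetical 2-column phenotype table: sample IDs plus a case/control label.
pheno_table <- data.frame(
  sample_id = c("S1", "S2", "S3", "S4"),
  status    = c("healthy", "asthmatic", "asthmatic", "healthy"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of Scissor): turn a label vector into a
# named 0/1 numeric vector, stopping early if the input has the wrong shape.
as_binary_phenotype <- function(labels, ids, case_label) {
  if (!is.null(dim(labels))) {
    stop("phenotype must be a vector, not a matrix or data frame")
  }
  if (anyNA(labels)) {
    stop("phenotype contains NA values")
  }
  if (length(unique(labels)) != 2) {
    stop("phenotype must contain exactly two distinct values")
  }
  if (!(case_label %in% labels)) {
    stop("case_label does not occur in the phenotype labels")
  }
  y <- as.numeric(labels == case_label)  # 1 = case, 0 = control
  names(y) <- ids
  y
}

# Pass only the label column, not the whole table.
phenotype <- as_binary_phenotype(
  pheno_table$status,
  pheno_table$sample_id,
  case_label = "asthmatic"
)

# Assumed ordering: tag[1] describes the 0 group, tag[2] the 1 group.
tag <- c("healthy", "asthmatic")
```

The resulting `phenotype` and `tag` can then be passed to `Scissor()` in place of the 2-column matrix. Passing the whole table to the helper fails immediately with a readable message instead of failing later inside the regression step.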

sunduanchen commented 2 years ago

Glad to know you worked it out. I will clarify this in the tutorial and release an update based on your suggestion.

Thanks a bunch!!

chenx9 commented 1 year ago

Hello, I have the same problem. Below are my phenotype information and code; I don't know where the problem is. I hope you can help me. Thank you very much!

table(phenotype)
phenotype
 0  1
47 33
head(phenotype)
GSM907792 GSM907793 GSM907794 GSM907795 GSM907796 GSM907797
        1         1         1         1         1         1

infos4 <- Scissor(bulk_dataset, sc_dataset , phenotype, tag = tag, alpha = 0.5,

liukun2463 commented 1 year ago

Hello, I have the same problem. Have you solved this problem? Could you help me? Thank you very much!

phenotype=read.table(inputFile2,sep="\t",header=T,check.names=F)
phenotype=as.matrix(phenotype)
phenotype
class(phenotype)
phenotype=apply(phenotype,2,as.numeric) # convert the matrix to numeric

     GSM4748450 GSM4748451 GSM4748452 GSM4748453 GSM4748454 GSM4748455
[1,]          0          0          0          0          0          0
     GSM4748456 GSM4748457 GSM4748458 GSM4748459 GSM4748460 GSM4748461
[1,]          0          0          0          0          0          0
     GSM4748462 GSM4748463 GSM4748464 GSM4748465 GSM4748466 GSM4748467
[1,]          0          0          1          1          1          1
     GSM4748468 GSM4748469 GSM4748470 GSM4748471 GSM4748472 GSM4748473
[1,]          1          1          1          1          1          1
     GSM4748474 GSM4748475 GSM4748476 GSM4748477 GSM4748478 GSM4748479
[1,]          1          1          1          1          1          1
     GSM4748480 GSM4748481 GSM4748482 GSM4748483 GSM4748484 GSM4748485
[1,]          1          1          1          1          1          1
     GSM4748486 GSM4748487 GSM4748488
[1,]          1          1          1
[1] "matrix" "array"

tag <- c('control', 'obesity')
infos1 <- Scissor(bulk_dataset, sc_dataset1, phenotype, tag = tag, alpha = 0.2, family = "binomial", Save_file = "Scissor.RData")
[1] "|**|"
[1] "Performing quality-check for the correlations"
[1] "The five-number summary of correlations:"
        0%        25%        50%        75%       100%
0.04852072 0.25830852 0.27764791 0.30061563 0.54160975
[1] "|**|"
[1] "Current phenotype contains 14 control and 25 obesity samples."
[1] "Perform logistic regression on the given phenotypes:"
Error in matrix(NA, nrow = nfolds, ncol = numi2) :
  invalid 'ncol' value (too large or NA)
In addition: Warning messages:
1: In asMethod(object) :
  sparse->dense coercion: allocating vector of size 4.0 GiB
2: In asMethod(object) :
  sparse->dense coercion: allocating vector of size 18.4 GiB
3: In Scissor(bulk_dataset, sc_dataset1, phenotype, tag = tag, alpha = 0.2, :
  NAs introduced by coercion
4: In matrix(sapply(outi, function(x) { :
  data length [10] is not a sub-multiple or multiple of the number of rows [49701]
5: In matrix(sapply(outi, function(x) { :
  data length [10] is not a sub-multiple or multiple of the number of rows [49701]