about Optimum_KernelC error

hfl112 commented 3 years ago

Hi Developers, I tried to use DeMixT on my data. At first it gave errors at "if (sum(id1) < 20) ", so I changed filter.sd to 0.8. But then I got an error like this:

Error in if (sum(obj == 0) > 1) { : missing value where TRUE/FALSE needed
Calls: DeMixT_GS -> Optimum_KernelC
Execution halted

No matter how I change ngene.selected.for.pi and ngene.Profile.selected, I always get this error.

Any advice on tuning the parameters? Or should I filter out some genes from my input? Thanks.

ShaolongCao commented 3 years ago

Hi @hfl112

First, can I ask what the data type is, and how many genes and samples you have for the tumor and the normal reference? Then I can advise on parameter tuning.

Judging by the error message, I believe there is something wrong with the input data. Could you double-check whether it contains any NAs or negative values? Please let me know if you have more questions.

Best, Shaolong
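
The check suggested above can be sketched in base R. This is only an illustration: the helper name check_expression_matrix and the toy matrix are hypothetical, and the count of all-zero genes is an extra check that the thread does not ask for. It assumes genes are in rows and samples in columns.

```r
# Hypothetical helper (not part of DeMixT): count problematic entries in an
# expression matrix with genes in rows and samples in columns.
check_expression_matrix <- function(mat) {
  mat <- as.matrix(mat)
  c(
    n_na             = sum(is.na(mat)),                         # NA or NaN entries
    n_nonfinite      = sum(!is.finite(mat) & !is.na(mat)),      # Inf or -Inf entries
    n_negative       = sum(mat < 0, na.rm = TRUE),              # negative entries
    n_all_zero_genes = sum(rowSums(mat != 0, na.rm = TRUE) == 0) # genes with no non-zero value
  )
}

# Toy data for illustration only.
toy <- matrix(c(5,  0, NA,   3,
                0,  0,  0,   0,
                2, -1,  4, Inf),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("g1", "g2", "g3"), paste0("s", 1:4)))

report <- check_expression_matrix(toy)
print(report)
```

Running the helper on both the tumor matrix and the normal reference before calling DeMixT_GS would show whether any of these entries are present.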

hfl112 commented 3 years ago

Hi Shaolong, Thank you for the reply. I used nneg to remove the negative values and changed all the NAs to 0 as well. The data have almost 20k genes, 200 tumor cases and 20+ normal cases. Is the normal sample size too small? Can I use the GTEx data as a reference?

Best, Funan

ShaolongCao commented 3 years ago

Hi @hfl112

20+ normal samples are enough. We recommend using adjacent normal samples as the reference. If you want to use GTEx data as the normal reference, we recommend performing scale normalization of the GTEx data together with your tumor data.

Best, Shaolong
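
Normalizing an external reference together with the tumor data means binding the two matrices column-wise, which only works if both contain the same genes in the same order. A minimal sketch of that alignment step follows; the function name combine_on_shared_genes and the toy matrices are hypothetical, and it assumes both matrices carry gene identifiers as row names.

```r
# Hypothetical helper (not part of DeMixT): keep only genes present in both
# matrices, put them in the same order, and bind the samples column-wise.
combine_on_shared_genes <- function(tumor, normal) {
  shared <- intersect(rownames(tumor), rownames(normal))
  if (length(shared) == 0) {
    stop("no shared gene identifiers between the two matrices")
  }
  cbind(tumor[shared, , drop = FALSE], normal[shared, , drop = FALSE])
}

# Toy data for illustration only.
tumor <- matrix(1:6, nrow = 3,
                dimnames = list(c("A", "B", "C"), c("t1", "t2")))
gtex  <- matrix(11:16, nrow = 3,
                dimnames = list(c("C", "A", "D"), c("n1", "n2")))

combined <- combine_on_shared_genes(tumor, gtex)
print(combined)
```

The combined matrix is what would then be passed to a joint normalization step.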

hfl112 commented 3 years ago

Do you mean z-score? Wouldn't that generate more negative values?
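
The difference behind this question can be illustrated on a toy matrix. A z-score centers each gene, so it necessarily produces negative values, whereas dividing each sample column by a positive factor (as the function posted later in this thread does) cannot turn a non-negative value negative. The counts and the factors below are made up for illustration; in practice the factors would be estimated from the data.

```r
# Toy non-negative counts: genes in rows, samples in columns.
counts <- matrix(c( 10,  20,  30,
                     0,   5,  10,
                   100, 200, 300),
                 nrow = 3, byrow = TRUE)

# Per-gene z-score: scale() works on columns, so transpose before and after.
z <- t(scale(t(counts)))

# Scale normalization: divide each sample (column) by a positive factor.
# These factor values are hypothetical.
factors <- c(0.5, 1, 2)
scaled <- sweep(counts, 2, factors, "/")

print(any(z < 0))        # z-scores contain negative values
print(all(scaled >= 0))  # scaled counts stay non-negative
```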

ShaolongCao commented 3 years ago

Hi @hfl112

We recommend normalizing the normal reference and the mixed samples together so that they are on the same scale. It has been shown that applying scale normalization to the mixed samples and the normal reference samples together yields robust estimates.

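    # newSeqCountSet() and estNormFactors() come from the Bioconductor DSS package.
    library(DSS)

    Quantile.Normalization.scale <- function(Count.matrix) {
      # Work on a copy without dimnames; they are restored at the end.
      newt <- Count.matrix
      colnames(newt) <- NULL
      rownames(newt) <- NULL

      # Put all samples in one design group and estimate quantile-based factors.
      designs <- c(rep("0", dim(Count.matrix)[2]))
      seqData <- newSeqCountSet(as.matrix(newt), designs)
      seqData <- estNormFactors(seqData, "quantile")

      # Rescale the factors so that their median is 1.
      k3 <- seqData@normalizationFactor
      mk3 <- median(k3)
      k3 <- k3 / mk3

      # Divide each sample (column) by its normalization factor.
      temp <- newt
      for (i in 1:ncol(newt)) {
        temp[, i] <- temp[, i] / k3[i]
      }

      Count.matrix.normalized <- temp
      colnames(Count.matrix.normalized) <- colnames(Count.matrix)
      rownames(Count.matrix.normalized) <- rownames(Count.matrix)

      return(Count.matrix.normalized)
    }

    # Normalize the tumor samples and the normal reference together.
    Quantile.Normalization.scale(cbind(data.Y, data.N1))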

Note that the data (i.e., data.Y for observed tumor expression, data.N1 for the normal reference) need to be in matrix format for normalization, before you encode them into SummarizedExperiment format.

Best, Shaolong
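
After joint normalization, the combined matrix has to be split back into its tumor and normal columns before each part is wrapped as a SummarizedExperiment. One possible sketch is shown below. The helper split_normalized and the stand-in matrix are hypothetical; the stand-in takes the place of the output of Quantile.Normalization.scale(cbind(data.Y, data.N1)), and the sketch assumes the tumor columns come first, as in that call. The assay name "counts" is also an assumption.

```r
# Hypothetical helper (not part of DeMixT): split a jointly normalized matrix
# back into tumor and normal parts, assuming tumor columns come first.
split_normalized <- function(normalized, n_tumor) {
  stopifnot(n_tumor >= 1, n_tumor < ncol(normalized))
  list(
    data.Y  = normalized[, seq_len(n_tumor), drop = FALSE],
    data.N1 = normalized[, (n_tumor + 1):ncol(normalized), drop = FALSE]
  )
}

# Stand-in for the jointly normalized matrix: 3 tumor and 2 normal samples.
normalized <- matrix(as.numeric(1:15), nrow = 3,
                     dimnames = list(paste0("g", 1:3),
                                     c("t1", "t2", "t3", "n1", "n2")))

parts <- split_normalized(normalized, n_tumor = 3)

# Wrap each part as a SummarizedExperiment if the package is installed.
if (requireNamespace("SummarizedExperiment", quietly = TRUE)) {
  data.Y.se <- SummarizedExperiment::SummarizedExperiment(
    assays = list(counts = parts$data.Y))
  data.N1.se <- SummarizedExperiment::SummarizedExperiment(
    assays = list(counts = parts$data.N1))
}
```

Splitting by column count keeps the sketch simple; splitting by column names would be safer if the sample order is not guaranteed.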

hfl112 commented 3 years ago

Thanks Shaolong, I'll try this. Have a good weekend, Funan

wwylab / DeMixT

about Optimum_KernelC error #16