navinlabcode / copykat

Meaning of dV=0.16, dW=0.001 when smoothing data #37

Closed ptdtan closed 2 years ago

ptdtan commented 3 years ago

Dear authors,

I saw at https://github.com/navinlabcode/copykat/blob/420a216eb149701ab96a2891ccefc170c434dbdd/R/copykat.R#L103-L104 that you use the dlm package to smooth the expression value of each gene using its surrounding genes. That is an interesting method compared to the averaging approach of inferCNV (as stated in the publication).

But how did you come up with the parameters dV=0.16, dW=0.001 in the modeling function? And what do they mean in the context of different sequencing technologies?

Thank you.

gaobio commented 2 years ago

dlm::dlmModPoly(order=1, dV=0.16, dW=0.001)

We ran many tests with different parameters. These default values gave us better results than the alternatives we tried when tested on 10x Genomics 3' scRNA-seq data from triple-negative cancer tumors. It is worth the effort to test other parameter values for data obtained from other platforms, particularly when their count distributions differ significantly from those of 3' scRNA-seq data.
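
On what the two numbers mean: in the dlm package, `dlmModPoly(order=1, ...)` specifies a local level model, where each observation is y_t = theta_t + v_t and the hidden level follows theta_t = theta_(t-1) + w_t. `dV` is the variance of the observation noise v_t and `dW` is the variance of the level noise w_t, so the ratio dW/dV controls how strongly neighboring values are pooled. The sketch below is a plain-Python re-implementation of a Kalman filter and smoother for that model, written only to illustrate the effect of the two parameters. It is not the copykat code, the function name `local_level_smooth` is hypothetical, and it assumes the model is run through a standard Kalman smoother with dlm's default prior (`m0 = 0`, `C0 = 1e7`).

```python
def local_level_smooth(y, dV=0.16, dW=0.001, m0=0.0, C0=1e7):
    """Kalman filter + backward smoother for a local level model.

    y_t     = theta_t + v_t,      v_t ~ N(0, dV)  (observation noise)
    theta_t = theta_{t-1} + w_t,  w_t ~ N(0, dW)  (level noise)

    Hypothetical helper for illustration; m0 and C0 mirror dlm's defaults.
    """
    n = len(y)
    a = [0.0] * n  # one-step-ahead prior mean of the level
    R = [0.0] * n  # one-step-ahead prior variance of the level
    m = [0.0] * n  # filtered mean
    C = [0.0] * n  # filtered variance

    m_prev, C_prev = m0, C0
    for t, obs in enumerate(y):
        a[t] = m_prev
        R[t] = C_prev + dW           # level drifts, so uncertainty grows by dW
        gain = R[t] / (R[t] + dV)    # small dW/dV -> small gain -> heavy smoothing
        m[t] = a[t] + gain * (obs - a[t])
        C[t] = (1.0 - gain) * R[t]
        m_prev, C_prev = m[t], C[t]

    # Backward pass: refine each filtered mean using later observations.
    s = m[:]
    for t in range(n - 2, -1, -1):
        s[t] = m[t] + (C[t] / R[t + 1]) * (s[t + 1] - a[t + 1])
    return s


# A noisy-looking toy series: values alternate around a true level of 0.
toy = [1.0 if i % 2 == 0 else -1.0 for i in range(50)]
smooth = local_level_smooth(toy)             # dV=0.16, dW=0.001
rough = local_level_smooth(toy, dW=1.0)      # larger dW tracks the data closely
print(max(abs(v) for v in smooth), max(abs(v) for v in rough))
```

With `dW` much smaller than `dV`, the model treats most of the gene-to-gene variation as noise and returns a nearly flat level; raising `dW` lets the level follow the data more closely. That is one way to reason about retuning the parameters for platforms whose count distributions differ from 3' scRNA-seq.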