jsxlei / SCALE

Single-cell ATAC-seq analysis via Latent feature Extraction
MIT License

Question about the missingness in scATAC-seq data #11

Open ttgump opened 4 years ago

ttgump commented 4 years ago

Hi, this is just a naive idea. In the paper, you describe scATAC-seq as suffering from "missingness", similar to dropout in scRNA-seq data. What about using a zero-inflated Bernoulli to model the scATAC-seq data, i.e. changing the reconstruction loss from binary cross-entropy to the negative log-likelihood of a zero-inflated Bernoulli? This would be analogous to the deep count autoencoder [1] and scVI [2], which use a zero-inflated negative binomial to model scRNA-seq data.

[1] Eraslan, G., Simon, L. M., Mircea, M., Mueller, N. S., & Theis, F. J. (2019). Single-cell RNA-seq denoising using a deep count autoencoder. Nature Communications, 10(1), 390.

[2] Lopez, R., Regier, J., Cole, M. B., Jordan, M. I., & Yosef, N. (2018). Deep generative modeling for single-cell transcriptomics. Nature Methods, 15(12), 1053.
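For concreteness, the proposed loss could be sketched roughly as below. This is a minimal illustration in plain Python, not code from SCALE: the function names `zi_bernoulli_nll` and `bce`, the dropout probability `pi`, and the toy values are all assumptions made for the example. The model assumed here is that an entry is forced to zero with probability `pi` and is otherwise drawn from Bernoulli(`p`), so P(x = 1) = (1 - pi) * p.

```python
import math

def bce(x, p, eps=1e-8):
    """Binary cross-entropy for one binary observation x and probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(x * math.log(p) + (1 - x) * math.log(1.0 - p))

def zi_bernoulli_nll(x, p, pi, eps=1e-8):
    """Negative log-likelihood of x under a zero-inflated Bernoulli.

    Assumed model (hypothetical, for illustration only):
      with probability pi the entry is a forced zero (dropout),
      otherwise x ~ Bernoulli(p).
    Hence P(x = 1) = (1 - pi) * p and P(x = 0) = 1 - (1 - pi) * p.
    """
    q = (1.0 - pi) * p  # marginal probability of observing a 1
    return bce(x, q, eps)

# Toy data: four peaks in one cell, with made-up decoder outputs.
peaks = [1, 0, 0, 1]
probs = [0.9, 0.2, 0.7, 0.6]
dropout = 0.3  # assumed shared dropout probability

loss_zib = sum(zi_bernoulli_nll(x, p, dropout) for x, p in zip(peaks, probs)) / len(peaks)
loss_bce = sum(bce(x, p) for x, p in zip(peaks, probs)) / len(peaks)
print(f"zero-inflated Bernoulli NLL: {loss_zib:.4f}, plain BCE: {loss_bce:.4f}")
```

One thing the sketch makes visible: marginally, a zero-inflated Bernoulli is still a Bernoulli with success probability (1 - pi) * p, so the loss only differs from plain binary cross-entropy if `pi` and `p` are parameterized differently, for example `pi` tied to a cell or a peak and `p` coming from the decoder.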

jsxlei commented 4 years ago

That sounds like a good idea; maybe you could try it. However, when we adapted our model to scRNA-seq with a zero-inflated negative binomial, we saw no improvement in either the latent representation or the imputation. We don't know whether it would work on scATAC-seq.