greenelab / czi-rfa

Application to "Collaborative Computational Tools for the Human Cell Atlas" https://chanzuckerberg.com/initiatives/rfa

Use zero-inflated reconstruction loss #11

Closed chrisprobert closed 7 years ago

chrisprobert commented 7 years ago

Single-cell RNA-seq data are sparse and zero-inflated. Several works [1-3] show advantages of modeling zero-inflated distances between cells over L1/L2 distances.

For the VAE in aim 1, I expect that a zero-inflated loss would work significantly better than a Euclidean reconstruction loss.

[1] https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0805-z
[2] https://arxiv.org/abs/1610.05857
[3] https://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1188-0
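
To make the suggestion concrete, here is a minimal sketch of one possible zero-inflated reconstruction loss: the negative log-likelihood of a zero-inflated negative binomial (ZINB). This is an illustration, not the implementation used in the proposal or in the cited works. The function names (`nb_log_pmf`, `zinb_nll`, `reconstruction_loss`) are hypothetical, and sharing a single dispersion `theta` and dropout probability `pi` across all genes is a simplifying assumption. In a VAE, the decoder would typically output these parameters per gene.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and dispersion theta > 0."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of one count under a zero-inflated
    negative binomial with dropout probability 0 < pi < 1."""
    log_nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero is either a dropout (probability pi) or a genuine NB zero.
        return -math.log(pi + (1.0 - pi) * math.exp(log_nb))
    # A nonzero count can only come from the NB component.
    return -(math.log(1.0 - pi) + log_nb)


def reconstruction_loss(counts, mus, theta, pi):
    """Sum of per-gene ZINB negative log-likelihoods for one cell.
    Assumption: theta and pi are shared across genes for simplicity."""
    return sum(zinb_nll(x, mu, theta, pi) for x, mu in zip(counts, mus))


if __name__ == "__main__":
    observed = [0, 0, 3, 0, 7, 0]                # sparse counts for one cell (made-up values)
    predicted = [0.5, 1.0, 2.5, 0.2, 6.0, 1.5]   # hypothetical decoder means
    print(reconstruction_loss(observed, predicted, theta=1.5, pi=0.3))
```

Unlike a squared-error loss, this loss does not penalize a predicted mean well above zero as heavily when the observed count is zero, because part of the probability of a zero is attributed to dropout.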

cgreene commented 7 years ago

Closed with #13. Thanks a ton @chrisprobert for the suggestion! I have included your name + a citation to this issue in the proposal. Please let me know by Sunday if you would prefer that I remove it.

gwaybio commented 6 years ago

A recent arXiv paper (https://arxiv.org/abs/1709.02082) describes this idea, with code implemented in TensorFlow here: https://github.com/YosefLab/ZINB-VAE

Note that the model is evaluated with 720 genes.

chrisprobert commented 6 years ago

Looks awesome!

On Fri, Sep 8, 2017 at 12:58 PM Greg Way notifications@github.com wrote:

A recent arxiv paper https://arxiv.org/abs/1709.02082 describes this idea, with code implemented in tensorflow here https://github.com/YosefLab/ZINB-VAE.
