Features that represent biological signals

greenelab / tybalt

Training and evaluating a variational autoencoder for pan-cancer gene expression data

BSD 3-Clause "New" or "Revised" License


Features that represent biological signals #150

Closed ZohrehShams closed 4 years ago

ZohrehShams commented 5 years ago

Hi, I’m interested in the part where you investigated the biological relevance of the extracted latent features. However, I have not been able to figure out how you find the dimensions that are most important for, say, distinguishing patients' sex. It appears that this is feature 82. Is this feature 82 (out of 100) in the latent space of the VAE with one hidden layer, or the one with two hidden layers? I’m guessing it’s the latter, which is why I was looking into tybalt_twohidden.ipynb. However, there the weights are extracted only across all samples, by multiplying the weights of the two hidden layers. So my question is: in the VAE with two hidden layers, how is the latent feature that distinguishes patients' sex identified as 82?
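For readers following along, the weight multiplication mentioned above can be sketched as follows. This is a minimal illustration, not the notebook's code: the array names, the hidden-layer size, the gene count, and the random stand-in weights are all assumptions. It treats the two decoder layers as a single linear map, which ignores biases and any nonlinearity between them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: 100 latent features, 300 hidden units, 5000 genes.
n_latent, n_hidden, n_genes = 100, 300, 5000

# Random stand-ins for the two decoder weight matrices of a trained model.
w_latent_to_hidden = rng.normal(size=(n_latent, n_hidden))
w_hidden_to_genes = rng.normal(size=(n_hidden, n_genes))

# Linear approximation of the decoder: one weight per (latent feature, gene).
combined_weights = w_latent_to_hidden @ w_hidden_to_genes

# Each row holds the gene weights of one latent feature. No sample dimension
# appears here, so the result is the same for every sample.
# Index 81 assumes features are numbered from 1, so feature 82 is row 81.
feature_82_gene_weights = combined_weights[81]

print(combined_weights.shape)
```

Because the product has no sample axis, it describes the model as a whole rather than any subgroup of patients.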

gwaybio commented 5 years ago

Hi @ZohrehShams - thanks for your interest in our project! I will answer your questions below.

Is this feature 82 (out of 100) in the latent space of the VAE with one hidden layer, or the one with two hidden layers?

The model with one hidden layer. This is the model presented in the PSB paper.

How is the latent feature that distinguishes patients' sex identified as 82?

I used a brute-force approach for this particular case. This Shiny app (also documented in the paper) helped: https://gregway.shinyapps.io/pancan_plotter/
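A brute-force scan of this kind can also be scripted. The sketch below is one possible version, not the procedure used for the paper: it ranks every latent feature by the absolute Welch t-statistic between two sample groups. The function name, the synthetic encodings, and the planted signal in column 81 are all hypothetical.

```python
import numpy as np


def rank_features_by_group_separation(encodings, is_group):
    """Rank latent features by |Welch t-statistic| between two sample groups.

    encodings: (n_samples, n_features) array of latent activations.
    is_group:  boolean array of length n_samples (e.g. True for male samples).
    Returns the feature columns sorted from most to least separating, and
    the per-feature t-statistics.
    """
    a = encodings[is_group]
    b = encodings[~is_group]
    # Welch's t-statistic allows unequal variances and group sizes.
    standard_error = np.sqrt(
        a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b)
    )
    t_stat = (a.mean(axis=0) - b.mean(axis=0)) / standard_error
    order = np.argsort(-np.abs(t_stat))
    return order, t_stat


# Synthetic demo: 400 samples, 100 latent features, signal planted in column 81.
rng = np.random.default_rng(0)
n_samples, n_features = 400, 100
is_male = rng.random(n_samples) < 0.5
encodings = rng.normal(size=(n_samples, n_features))
encodings[is_male, 81] += 3.0

order, t_stat = rank_features_by_group_separation(encodings, is_male)
print("most separating column (0-based):", order[0])
```

With real data, `encodings` would be the encoded samples and `is_male` would come from the clinical annotations. The ranking only flags candidates, so plotting the top features is still worthwhile.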

Brute force is neither satisfying nor scalable. We also developed this method, which may be of interest.

ZohrehShams commented 5 years ago

Thanks very much for your reply and the clarification. In that case, is it possible in tybalt_twohidden to extract gene weights for a specific target (e.g. gender = male)? As it stands, count_high_weight_genes, which is based on the extracted weights, does the counting across all samples. Thanks very much.
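One way to approach this, sketched under assumptions: decoder weights belong to the model rather than to individual samples, so a target such as sex would enter through the choice of latent feature, not through the weights. Once a feature that separates the target is chosen, the genes in the tails of its weight distribution can be listed. The function below is a hypothetical helper, not the notebook's count_high_weight_genes, and the 2.5 standard deviation cutoff and the synthetic weights are illustrative choices.

```python
import numpy as np


def high_weight_genes(feature_weights, gene_names, n_std=2.5):
    """Return genes in the positive and negative tails of one feature's weights.

    A gene is "high weight" if its weight lies more than n_std standard
    deviations above or below the mean weight of that feature.
    """
    weights = np.asarray(feature_weights, dtype=float)
    names = np.asarray(gene_names)
    mean, std = weights.mean(), weights.std()
    positive = names[weights > mean + n_std * std]
    negative = names[weights < mean - n_std * std]
    return positive, negative


# Synthetic demo: 5000 genes with small random weights and a few planted outliers.
rng = np.random.default_rng(0)
gene_names = np.array([f"gene_{i}" for i in range(5000)])
weights = rng.normal(scale=0.01, size=5000)
weights[:3] = 0.2    # planted strongly positive genes
weights[3:5] = -0.2  # planted strongly negative genes

positive, negative = high_weight_genes(weights, gene_names)
print(len(positive), "positive-tail genes;", len(negative), "negative-tail genes")
```

In practice `weights` would be the row of the extracted weight matrix for the chosen feature.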

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.