bm2-lab / scMVP

MIT License

Questions about the imputed values #5

Closed hongruhu closed 2 years ago

hongruhu commented 2 years ago

Hi, scMVP is a very interesting paper, and first of all, thank you for your work. I have some questions regarding the imputed_values returned when I ran the imputation() module.

Q1: Could you explain each of the 4 matrices in imputed_values? The first one is [N x Gene], the second one is [N x Peak], the third one is [N, ], and the last one is also [N x Peak]. What's the difference between the 2nd and the 4th?

Q2: Is there a way to use another held-out set (same format as the training set) to test the trained model?

Q3: Is there a way to impute modality B (e.g. RNA) from a unimodal modality A alone (e.g. unimodal ATAC)?

Q4: I'm a little confused about n_epochs in the training step trainer.train(n_epochs=n, lr=lr). In the demo notebook (https://github.com/bm2-lab/scMVP/blob/master/demos/manuscript_analysis/snare_cellline_demo.ipynb), n_epochs=10, so I assumed training runs for 10 epochs. But in the training curve plotting code I saw x = np.linspace(0, 500, (len(elbo_train_set))). What does the 500 mean here?

Thank you very much in advance!

adamtongji commented 2 years ago

Hi Hongru,

Q1. The first is the RNA imputation matrix, the second is the ATAC imputation matrix, and the third is the cell labels. The 4th matrix was only used for testing.

Q2 & Q3. Really interesting points! scMVP is not designed or tested for these functions. You may refer to other related tools or papers that provide models for transfer across large datasets or for ATAC-to-RNA imputation.

Q4. The "500" is not related to the training epochs or iterations. It only sets the right-hand end of the plot's x-axis, so you can set it to any value, or to a more meaningful value you prefer.
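
To illustrate the Q4 point, here is a minimal sketch of how the x-axis values are built. The ELBO numbers below are made up for illustration and do not come from scMVP, and the assumption that the trainer recorded 10 values is hypothetical; only the np.linspace call mirrors the demo notebook.

```python
import numpy as np

n_epochs = 10

# Hypothetical stand-in for the ELBO history recorded during training
# (made-up values; in the notebook this comes from the trainer).
elbo_train_set = np.array([1500.0, 1320.0, 1210.0, 1150.0, 1110.0,
                           1085.0, 1070.0, 1060.0, 1054.0, 1050.0])

# As in the demo: 500 is only the right-hand end of the x-axis.
# The number of points is still len(elbo_train_set).
x_demo = np.linspace(0, 500, len(elbo_train_set))

# One possible "more meaningful" choice: spread the recorded values
# over the epochs actually trained, so the axis reads in epochs.
x_epochs = np.linspace(1, n_epochs, len(elbo_train_set))

print(x_demo[0], x_demo[-1])      # axis runs from 0 to 500
print(x_epochs[0], x_epochs[-1])  # axis runs from 1 to n_epochs
```

Either array can be passed as the x argument when plotting the curve. The shape of the curve is the same in both cases, only the tick labels change.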