rvinas / GTEx-imputation

Gene Expression Imputation with Generative Adversarial Imputation Nets
MIT License

Are hyperparameters default for "R2 imputation scores" in figure 2? Should checkpoint.restore(...) help checking intermediate model results? #4

Closed yezhengli-Mr9 closed 4 years ago

yezhengli-Mr9 commented 4 years ago

Hi Ramon, (1) Are the hyperparameters the default ones for the "R2 imputation scores" in Fig. 2? I hope they are, but my runs typically last at most one hour before patience decreases to 0, and the resulting boxplots are far worse than those in the figure.

(2) I noticed there is no checkpoint.restore(...) in the code, but checkpoint.restore(...) should help with checking intermediate model results (for the 9+ hour training), correct? I just added checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir)), following the official tutorial for checkpoint.restore(...).

(3) I have not set the np.nan entries to zero; instead I force the corresponding entries of the masks to zero. I hope I can still get R2 imputation scores similar to the ones in Fig. 2.
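One thing worth checking for point (3) is how NaN behaves under masking. The sketch below assumes a mask convention of 1 for observed and 0 for entries to impute, and assumes the model combines data and mask by elementwise multiplication; neither is confirmed by the thread. Under those assumptions, zeroing the mask alone does not remove the NaN, because NaN times 0 is still NaN in floating point.

```python
import numpy as np

# Toy data with missing entries (values are made up for illustration).
x = np.array([[0.5, np.nan, -1.2],
              [np.nan, 0.3, 0.8]])

# Assumed convention: 1 = observed, 0 = to be imputed.
mask = np.ones_like(x)
mask[np.isnan(x)] = 0.0

# Masking alone keeps the NaN, since NaN * 0 evaluates to NaN.
masked_naive = x * mask

# Replacing NaN with a finite placeholder first gives a clean masked input.
x_filled = np.nan_to_num(x, nan=0.0)
masked_safe = x_filled * mask

print(np.isnan(masked_naive).sum())  # number of NaN entries that survived
print(np.isnan(masked_safe).any())   # whether any NaN remains after filling
```

If NaN values reach the loss this way, they can propagate into the gradients, so it may be worth confirming that the inputs are finite before training.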

rvinas commented 4 years ago

(1) Yes, it should be possible to get the R2 imputation scores with the default hyperparameters. Those scores were obtained for a missing rate of 0.5 and using standardised data (e.g. each gene has mean 0 and std 1, and the per-gene distributions are approximately normal).

(2) There is no need for checkpoint.restore, but you can if you want. We ended up not using checkpoints because of an error in our machine that stopped the training process.

(3) I suppose this refers to the function that computes the R2 scores. It should be fine as long as you only take into account the imputed components to compute the scores.
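A minimal sketch of the save-and-resume pattern discussed in point (2), assuming TensorFlow 2.x. The variables `step` and `weights` and the temporary `checkpoint_dir` are placeholders for illustration; they are not taken from the repository's training script.

```python
import tempfile

import tensorflow as tf

# Hypothetical state to track; a real script would pass the model and optimizer.
checkpoint_dir = tempfile.mkdtemp()
step = tf.Variable(0, dtype=tf.int64)
weights = tf.Variable([1.0, 2.0])

checkpoint = tf.train.Checkpoint(step=step, weights=weights)
manager = tf.train.CheckpointManager(checkpoint, checkpoint_dir, max_to_keep=3)

# Resume if a checkpoint exists; latest_checkpoint returns None otherwise.
latest = tf.train.latest_checkpoint(checkpoint_dir)
if latest is not None:
    checkpoint.restore(latest)

# Simulate one training step, then save.
weights.assign([3.0, 4.0])
step.assign_add(1)
manager.save()

# Overwrite the state, then restore it from the latest checkpoint.
weights.assign([0.0, 0.0])
checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))
restored = weights.numpy().tolist()
```

Restoring only brings back the objects attached to the `tf.train.Checkpoint`, so anything that should survive a restart (for example an early-stopping patience counter) would need to be tracked there as well.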

yezhengli-Mr9 commented 4 years ago

(1) Yes, I know about the "0.5 missing rate", etc. OK, I think I already keep x = standardize(x) here.

(2) OK. Although there is no bug in checkpoint.restore(...) here, my boxplots for the data and the entire procedure (without replacing np.nan by zero) are not satisfactory.

(3) Let me take a detailed look at your "function that computes the R2 scores" (initially I thought my version was similar).

Thanks, let me try a few more times.
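For reference, the evaluation setting described in this thread (standardised genes, missing rate 0.5, scores computed only on imputed components) could be sketched roughly as follows. This is not the repository's scoring function: `standardize` and `r2_imputed_only` are hypothetical helpers, R2 is assumed to be the coefficient of determination per gene, the mask convention (1 = observed, 0 = imputed) is an assumption, and the data are random numbers.

```python
import numpy as np


def standardize(x):
    """Scale each gene (column) to mean 0 and standard deviation 1."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def r2_imputed_only(x_true, x_imputed, mask):
    """Per-gene R2 using only entries where mask == 0 (the imputed ones)."""
    n_genes = x_true.shape[1]
    scores = np.full(n_genes, np.nan)
    for j in range(n_genes):
        missing = mask[:, j] == 0
        if missing.sum() < 2:
            continue  # not enough imputed entries to score this gene
        y = x_true[missing, j]
        y_hat = x_imputed[missing, j]
        ss_res = np.sum((y - y_hat) ** 2)
        ss_tot = np.sum((y - y.mean()) ** 2)
        if ss_tot > 0:
            scores[j] = 1.0 - ss_res / ss_tot
    return scores


rng = np.random.default_rng(0)
x = standardize(rng.normal(size=(200, 5)))

# Missing rate of 0.5: each entry is hidden with probability 0.5.
mask = (rng.uniform(size=x.shape) > 0.5).astype(float)

# Sanity checks: a perfect imputer scores 1, and filling with zeros
# (the per-gene mean after standardisation) cannot score above 0.
perfect = r2_imputed_only(x, x, mask)
zero_baseline = r2_imputed_only(x, np.zeros_like(x), mask)
```

Comparing a custom scoring function against simple baselines like these is a quick way to check that observed entries are not leaking into the score.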