Closed: siboehm closed this 2 years ago
I think we should change the name, since we no longer sweep over the embeddings; these are fixed to the best-performing model per drug embedder on LINCS.
So just sciplex_hparam?
Trapnell is actually the name of the PI who led the experiment.
@siboehm, we might have to adjust the split on which we finetune and search for the hyperparameters.
Hm. 0.023, which is the optimal value. This may mean we have to go down with the adversarial penalty (< 1), since the model may be focusing too much on lowering that penalty instead of lowering the reconstruction. I'd probably test this out using one embedding only at first.
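To make the trade-off above concrete, here is a minimal sketch of how a sub-1 adversarial penalty down-weights the adversary term relative to the reconstruction term. The function name and sign convention are illustrative assumptions, not the repo's actual training code:

```python
def autoencoder_objective(recon_loss: float, adversary_loss: float,
                          adv_penalty: float = 0.5) -> float:
    """Combined objective minimized by the autoencoder (illustrative).

    The encoder is rewarded when the adversary performs poorly, so the
    adversary loss enters with a negative sign, scaled by adv_penalty.
    Setting adv_penalty < 1 shifts the focus toward the reconstruction.
    """
    return recon_loss - adv_penalty * adversary_loss
```

With adv_penalty = 0.5, a unit improvement in reconstruction is worth twice as much to the optimizer as a unit degradation of the adversary, which is the direction suggested above.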
Didn't yet see any of the Vanilla runs, I wonder how they turn out. I guess if this is a hold-out split then Vanilla won't work anyway.
It is not a real hold-out; just some drugs, combined with the highest dosage, are put into 'ood'/'test'.
I think the Vanilla runs are last in the list... something to wait for.
I attached the plot of the training metrics to show that, for DE genes, pretrained models seem to perform better. Again, this is preliminary, as not all runs are finished yet.
As it stands there's no parameter finetuning in #85 or #84. Parameters to sweep:

(1) autoencoder_lr, autoencoder_wd, batch_size. We should do this separately for the finetuned and the from-scratch training.
(2) dosers_lr & dosers_wd.

The drug embedder is covered by autoencoder_lr (autoencoder + drug embedder are updated using the same optimizer). I think (1) is important, as a good lr may make a difference for finetuning and the classification task for the adversaries is pretty different on Trapnell. (2) is probably much less important; we could use it as a source of variation during the individual runs.
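The sweep over these parameters could be set up roughly like this. The search space values below are placeholders I made up for illustration; only the parameter names come from the discussion:

```python
import random

# Hypothetical sweep space; names mirror the parameters discussed above,
# the value grids are illustrative only.
SWEEP_SPACE = {
    "autoencoder_lr": [1e-4, 3e-4, 1e-3],
    "autoencoder_wd": [1e-7, 1e-6, 1e-5],
    "batch_size": [64, 128, 256],
    "dosers_lr": [1e-4, 1e-3],
    "dosers_wd": [1e-7, 1e-6],
}

def sample_config(space, seed=None):
    """Draw one random configuration from the sweep space."""
    rng = random.Random(seed)
    return {name: rng.choice(values) for name, values in space.items()}
```

Per (2) above, dosers_lr/dosers_wd could simply be left to vary randomly across individual runs rather than searched systematically, while (1) would get its own dedicated sweep for the finetuned and from-scratch settings.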
lincs_rdkit_hparam. RDKit embedding.