Closed siboehm closed 2 years ago
I really like this setup! Also the additional README.md is nice :)
I assume the experiments will be completed by tomorrow; at the moment the status is:
********** Report for database collection 'lincs_rdkit_hparam' **********
* - 0 staged experiments
* - 4 pending experiments
* - 10 running experiments
* - 11 completed experiments
* - 0 interrupted experiments
* - 0 failed experiments
* - 0 killed experiments
*************************************************************************
I think we should then add the corresponding results notebook, and then the PR is good to go?
Hi @siboehm, I analysed the sweep for lincs_rdkit_hparams, and we have quite good models among them. I changed the default plotting function to violin plots, as they show both the distribution and the individual runs.
Turns out that the best performing model performs best with respect to all our selection criteria: perturbation disentanglement, test_mean, and test_mean_de.
Am I correct that the disentanglement score is now non-linear? Scores have worsened quite a bit overall.
Do you think that we should just take the hparams of this best performing model and start the 2nd part of the experiment?
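A minimal sketch of the "best on every criterion" check described above, in plain Python. The run records and their values are made-up toy data, and the assumption that test_mean and test_mean_de are higher-is-better while the disentanglement score is lower-is-better is mine; the helper names are hypothetical and not part of the repository.

```python
# Toy sketch: check whether a single run wins on every selection criterion.
# All values below are invented for illustration, not sweep results.

# Assumption: True means "higher is better", False means "lower is better".
CRITERIA = {
    "perturbation_disentanglement": False,
    "test_mean": True,
    "test_mean_de": True,
}

toy_runs = [
    {"run_id": "a", "perturbation_disentanglement": 0.10, "test_mean": 0.90, "test_mean_de": 0.70},
    {"run_id": "b", "perturbation_disentanglement": 0.25, "test_mean": 0.85, "test_mean_de": 0.65},
    {"run_id": "c", "perturbation_disentanglement": 0.40, "test_mean": 0.80, "test_mean_de": 0.60},
]


def best_per_criterion(runs, criteria):
    """Return a dict mapping each criterion to the run_id of its best run."""
    best = {}
    for name, higher_is_better in criteria.items():
        pick = max if higher_is_better else min
        best[name] = pick(runs, key=lambda run: run[name])["run_id"]
    return best


def unanimous_winner(runs, criteria):
    """Return the run_id that is best on all criteria, or None if they disagree."""
    winners = set(best_per_criterion(runs, criteria).values())
    return winners.pop() if len(winners) == 1 else None


if __name__ == "__main__":
    print(best_per_criterion(toy_runs, CRITERIA))
    print(unanimous_winner(toy_runs, CRITERIA))
```

If the criteria disagree, `unanimous_winner` returns `None`, which would be the signal to look at the trade-off by hand instead of picking a single configuration.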
Yes, the disentanglement score is now calculated using a non-linear model, which should make the scores worse (= higher) overall. Let's hope that the hparams have a large influence on the disentanglement score, instead of good scores being due to getting lucky during the adversarial training.
Yes, we should start the 2nd part, I'll write a new yaml. Surprisingly, the top configurations look quite dissimilar (the 1st model is comparatively small, whereas the 3rd model is pretty big). I'd just follow Niki's advice and pick the top performing hparams without thinking about it too much. It even has latent size 32!
I picked the best hparams for the autoencoder and the adversarial. The parameters of the drug embedding and doser are being randomly swept using the same range as before. I'm also sweeping the step_size_lr again, as it applies to all optimizers (AE + Adv + drug embedder + doser).
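The second-stage setup (some hparams fixed to the best values, the rest randomly sampled) can be sketched roughly as follows. This is only an illustration in plain Python, not the actual yaml or the sweep tooling used here; apart from step_size_lr, every parameter name and every value below is a hypothetical placeholder.

```python
import random


def sample_configs(fixed, random_space, n_samples, seed=0):
    """Build n_samples configs: fixed values are copied, the rest drawn at random.

    fixed:        dict of hparams pinned to one value (e.g. the best AE/Adv hparams)
    random_space: dict mapping hparam name -> list of candidate values
    """
    rng = random.Random(seed)  # local RNG so the sweep is reproducible
    configs = []
    for _ in range(n_samples):
        config = dict(fixed)
        for name, candidates in random_space.items():
            config[name] = rng.choice(candidates)
        configs.append(config)
    return configs


if __name__ == "__main__":
    # Hypothetical placeholder names and values, for illustration only.
    fixed = {"autoencoder_width": 256, "adversary_width": 128}
    random_space = {
        "embedding_encoder_width": [128, 256, 512],
        "dosers_width": [64, 128, 256],
        "step_size_lr": [50, 100, 200],
    }
    for config in sample_configs(fixed, random_space, n_samples=3, seed=42):
        print(config)
```

Keeping the sampled and fixed parts in separate dicts makes it easy to see at a glance which hparams are still being explored.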
@MxMstrmn Can you have a brief look? Mainly at the list of embeddings.
Fun facts: I don't think the dropout hparam is actually used anywhere, I can't find any references to it in the code.
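One way to double-check a claim like this is a whole-word search over the source tree. The snippet below is a small standard-library sketch; the function name and the choice to scan only `.py` files are my assumptions, and a plain `grep -rnw dropout .` would do much the same job.

```python
import re
from pathlib import Path


def find_references(root, identifier, suffix=".py"):
    """Return (path, line_number, line) for every whole-word match of identifier.

    The match is case-sensitive and respects word boundaries, so searching for
    "dropout" finds `self.dropout` but not `Dropout` or `dropout_rate`.
    """
    pattern = re.compile(rf"\b{re.escape(identifier)}\b")
    hits = []
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    for path, lineno, line in find_references(".", "dropout"):
        print(f"{path}:{lineno}: {line}")
```

If the only hits are in config files or argument parsing and none in model code, that supports the suspicion that the hparam is read but never applied.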
Yet another Code Gem 💎
@siboehm, I will edit the config and start the run tomorrow morning most likely.
We'd obviously still need to make the final plots for the paper, but I think in terms of results we have all we need for this experiment.
We'll still have to update the notebook with the final results, but there are some code changes in this PR that should make it into main soon, so I'm merging this.
Ref #69
Draft PR for now, just to show what the new folder structure would look like. Results are going into their own MongoDB collection, called lincs_rdkit_hparam.