choderalab / pinot

Probabilistic Inference for NOvel Therapeutics
MIT License

Updating generative model to work more seamlessly with net + background data #50

Closed dnguyen1196 closed 4 years ago

dnguyen1196 commented 4 years ago

Unlabelled data set

Added the ZINC and MOSES data sets, together with their tiny and small versions, such as pinot.data.zinc_tiny and pinot.data.moses_tiny. Also modified pinot.data.utils.batch to handle both unlabeled data (ZINC, MOSES) and labeled data (ESOL).
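
To illustrate what handling both kinds of data in one batching helper can look like, here is a minimal sketch. It is not the actual pinot.data.utils.batch implementation: the record layout (a bare graph for unlabeled data, a (graph, measurement) pair for labeled data), the use of SMILES strings as stand-ins for molecular graphs, and the name batch_records are all assumptions made for this example.

```python
from typing import List, Optional, Sequence, Tuple, Union

# Hypothetical stand-ins: a SMILES string plays the role of a molecular graph.
Graph = str
Record = Union[Graph, Tuple[Graph, float]]
Batch = Tuple[List[Graph], Optional[List[float]]]


def batch_records(records: Sequence[Record], batch_size: int) -> List[Batch]:
    """Split records into batches of (graphs, labels).

    labels is None for unlabeled data, so downstream code can branch on it.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    if not records:
        return []

    # Decide once, from the first record, whether the data set is labeled.
    labeled = isinstance(records[0], tuple)
    if any(isinstance(r, tuple) != labeled for r in records):
        raise ValueError("cannot mix labeled and unlabeled records")

    batches: List[Batch] = []
    for start in range(0, len(records), batch_size):
        chunk = records[start:start + batch_size]
        if labeled:
            graphs = [g for g, _ in chunk]
            labels = [y for _, y in chunk]
            batches.append((graphs, labels))
        else:
            batches.append((list(chunk), None))
    return batches
```

With this layout, a training loop for the generative model can ignore the labels entry entirely, while a supervised loop can require it to be present.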

CLI tool

Added more functionality to pinot.app.cli.py, which can now either load a pretrained generative model or train a new generative model using background data. This also means that the generative model now works with Net: one can initialize Net with a trained or newly initialized GCNModelVAE.

pinot.app.cli.py can be used as a command line tool as follows:

usage: cli.py [-h] [--pretrained_gen_model PRETRAINED_GEN_MODEL]
              [--layer LAYER]
              [--hidden_dims_gvae HIDDEN_DIMS_GVAE [HIDDEN_DIMS_GVAE ...]]
              [--embedding_dim EMBEDDING_DIM]
              [--background_data BACKGROUND_DATA]
              [--n_epochs_generative N_EPOCHS_GENERATIVE]
              [--optimizer_generative OPTIMIZER_GENERATIVE]
              [--lr_generative LR_GENERATIVE] [--save_model SAVE_MODEL]
              [--free_gradient] [--noise_model NOISE_MODEL]
              [--optimizer OPTIMIZER] [--out OUT] [--data DATA]
              [--batch_size BATCH_SIZE] [--lr LR] [--partition PARTITION]
              [--n_epochs N_EPOCHS]

With pretrained generative model:
  --pretrained_gen_model PRETRAINED_GEN_MODEL
                        Parameter file of pretrained generative model

With no pretrained generative model:
  --layer LAYER         Type of graph convolution layer
  --hidden_dims_gvae HIDDEN_DIMS_GVAE [HIDDEN_DIMS_GVAE ...]
                        Hidden dimensions of the convolution layers
  --embedding_dim EMBEDDING_DIM
                        Embedding dimension (dimension of the encoder's
                        output)
  --background_data BACKGROUND_DATA
                        Background data to pre-train generative model on
  --n_epochs_generative N_EPOCHS_GENERATIVE
                        Number of epochs of generative model pre-training
  --optimizer_generative OPTIMIZER_GENERATIVE
                        Optimizer for generative model pre-training
  --lr_generative LR_GENERATIVE
                        Learning rate for generative model pre-training
  --save_model SAVE_MODEL
                        File to save generative model to

Net settings:
  --free_gradient       Allow for updating gradients of pretrained generative
                        model
  --noise_model NOISE_MODEL
                        Noise model for predictive distribution
  --optimizer OPTIMIZER
                        Choice of optimizer
  --out OUT             Folder to print out results to
  --data DATA           Data set name
  --batch_size BATCH_SIZE
                        Batch size
  --lr LR               Learning rate
  --partition PARTITION
                        Training-testing split
  --n_epochs N_EPOCHS   Number of epochs
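
For readers who want to see how a help layout like the one above can be produced, the following is a sketch using argparse argument groups. The flag names, group titles, and help strings are taken from the output above; the argument types (int, float, str) and the absence of defaults are assumptions, and this is not a copy of the real pinot.app.cli.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cli.py")

    # Option 1: load an already trained generative model.
    pre = parser.add_argument_group("With pretrained generative model")
    pre.add_argument("--pretrained_gen_model",
                     help="Parameter file of pretrained generative model")

    # Option 2: train a new generative model on background data.
    new = parser.add_argument_group("With no pretrained generative model")
    new.add_argument("--layer", help="Type of graph convolution layer")
    new.add_argument("--hidden_dims_gvae", nargs="+", type=int,  # type assumed
                     help="Hidden dimensions of the convolution layers")
    new.add_argument("--embedding_dim", type=int,  # type assumed
                     help="Embedding dimension (dimension of the encoder's output)")
    new.add_argument("--background_data",
                     help="Background data to pre-train generative model on")
    new.add_argument("--n_epochs_generative", type=int,  # type assumed
                     help="Number of epochs of generative model pre-training")
    new.add_argument("--optimizer_generative",
                     help="Optimizer for generative model pre-training")
    new.add_argument("--lr_generative", type=float,  # type assumed
                     help="Learning rate for generative model pre-training")
    new.add_argument("--save_model", help="File to save generative model to")

    # Settings for the downstream Net.
    net = parser.add_argument_group("Net settings")
    net.add_argument("--free_gradient", action="store_true",
                     help="Allow for updating gradients of pretrained generative model")
    net.add_argument("--noise_model",
                     help="Noise model for predictive distribution")
    net.add_argument("--optimizer", help="Choice of optimizer")
    net.add_argument("--out", help="Folder to print out results to")
    net.add_argument("--data", help="Data set name")
    net.add_argument("--batch_size", type=int, help="Batch size")  # type assumed
    net.add_argument("--lr", type=float, help="Learning rate")  # type assumed
    net.add_argument("--partition", help="Training-testing split")
    net.add_argument("--n_epochs", type=int, help="Number of epochs")  # type assumed
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

A call such as build_parser().parse_args(["--data", "esol", "--background_data", "zinc_tiny"]) parses under this sketch; whether the real tool accepts those exact data set names is an assumption.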

lgtm-com[bot] commented 4 years ago

This pull request introduces 1 alert when merging 30bedd8ab55eef67cb01b45a9a4468941d24a500 into 0548dcd7b78be04c9a953614e4d4a78ef0624170 - view on LGTM.com

new alerts:

lgtm-com[bot] commented 4 years ago

This pull request introduces 1 alert when merging c6bf44477463ac58d1a71200450502ce67865613 into 0548dcd7b78be04c9a953614e4d4a78ef0624170 - view on LGTM.com

new alerts:

lgtm-com[bot] commented 4 years ago

This pull request introduces 2 alerts when merging ce453b8f5a286307a775f61753508bce3834df1f into 0548dcd7b78be04c9a953614e4d4a78ef0624170 - view on LGTM.com

new alerts:

lgtm-com[bot] commented 4 years ago

This pull request introduces 2 alerts when merging faeaf7890ec0a33d33311a5ce095001c1501d865 into 0548dcd7b78be04c9a953614e4d4a78ef0624170 - view on LGTM.com

new alerts: