greenelab / tybalt

Training and evaluating a variational autoencoder for pan-cancer gene expression data
BSD 3-Clause "New" or "Revised" License

Add Two Hidden Layer Model #81

Closed gwaybio closed 6 years ago

gwaybio commented 6 years ago

This is a fairly large pull request that adds a new Jupyter notebook training two distinct two-hidden-layer Tybalt models. The notebook is added in e82c365 and the nbconverted script in 61bbb1c. The biggest change is the addition of a Tybalt class that does most of the heavy lifting. This partially addresses #74, but not completely, since the class cannot be imported yet.

The two models have architectures:

  1. 5000 -> 100 -> 100 -> 100 -> 5000
  2. 5000 -> 300 -> 100 -> 300 -> 5000

Both models are trained in e82c365 with the optimal hyperparameters identified in #71 and #77.
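To make the two layer layouts concrete, here is a minimal sketch that walks through the dense-layer shapes each architecture implies and counts weights plus biases. It is not the Tybalt class from this PR. It assumes the output layer matches the 5000-gene input (as an autoencoder reconstruction would), and it treats each arrow as a single fully connected layer, ignoring the separate mean and log-variance layers a variational autoencoder uses at the latent step. The helper names `dense_layer_shapes` and `count_parameters` are hypothetical.

```python
# Simplified shape walk-through; NOT the Tybalt class added in this PR.
# Assumption: each "->" is one fully connected layer with a bias term,
# and the final layer reconstructs all 5000 input genes.


def dense_layer_shapes(layer_sizes):
    """Return (n_inputs, n_outputs) for each consecutive pair of layers."""
    return list(zip(layer_sizes[:-1], layer_sizes[1:]))


def count_parameters(layer_sizes):
    """Count weights plus biases across a stack of dense layers."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in dense_layer_shapes(layer_sizes)
    )


# Layer sizes taken from the two architectures listed above
model_1 = [5000, 100, 100, 100, 5000]
model_2 = [5000, 300, 100, 300, 5000]

for name, sizes in [("model 1", model_1), ("model 2", model_2)]:
    print(name, dense_layer_shapes(sizes), count_parameters(sizes))
```

Under these simplifying assumptions, the wider 300-unit hidden layers roughly triple the parameter count, because the 5000-gene input and output layers dominate the total.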

The rest of the files are outputs generated by the notebook, including:

| Description | Format | Git LFS? | Commit |
| --- | --- | --- | --- |
| The actual encoder and decoder keras models | .hdf5 | :white_check_mark: | 9883611 |
| The latent feature matrices of both models | .tsv.gz | :white_check_mark: | 3dce40f |
| The gene weight matrices for both models | .tsv | :white_check_mark: | 3cc4e9b |
| Training history figure + architecture | .png/.pdf | :x: | f346f65 |

I also update the Git LFS .gitattributes file in 46f113f to automatically track any .hdf5 file saved in the models/ folder (shout out to @dhimmel here! :smiling_imp:).