Predicting Multimodal Distributions

Hi,

First of all, thanks for sharing your code! I'm trying to use this SFNN implementation to model multimodal distributions. I tested it on two datasets, but I can't get good results. Did I make a mistake somewhere? Do you have any suggestions?

Network Settings

Layers: 4
- First deterministic
- Second hybrid (4 stochastic nodes)
- Third hybrid (4 stochastic nodes)
- Fourth deterministic
Layer size: 16
Importance samples: 30
Learning rate: 0.001
Epochs: 100
Mini-batch size: 100
Activation Function: sigmoid (for all layers/nodes)
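
For context on the "Importance samples" setting and the log-probability curves below, here is a minimal sketch of how a Monte Carlo estimate of log p(y|x) can be formed from several stochastic forward passes, assuming the estimator described by Tang et al. (average the likelihood over samples of the stochastic units). The repository may compute it differently. The names `estimate_log_prob` and `sample_mean`, the Gaussian output, and the value of `sigma` are all hypothetical choices for illustration.

```python
import math
import random

def log_mean_exp(log_values):
    """Numerically stable log(mean(exp(v))) over a list of log-values."""
    m = max(log_values)
    return m + math.log(sum(math.exp(v - m) for v in log_values) / len(log_values))

def gaussian_log_pdf(y, mean, sigma):
    """Log-density of a scalar Gaussian N(mean, sigma^2) evaluated at y."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (y - mean) ** 2 / (2.0 * sigma ** 2)

def estimate_log_prob(y, sample_mean, n_samples=30, sigma=0.1):
    """Monte Carlo estimate of log p(y|x) from n_samples stochastic passes.

    sample_mean is a hypothetical callable: each call is assumed to draw
    fresh stochastic units and return the network's output mean for that draw.
    """
    log_terms = [gaussian_log_pdf(y, sample_mean(), sigma) for _ in range(n_samples)]
    return log_mean_exp(log_terms)

# Toy stand-in for a bimodal network (not the SFNN itself):
# the output mean is -1 or +1 with equal probability.
rng = random.Random(0)
toy_forward = lambda: rng.choice([-1.0, 1.0])
print(estimate_log_prob(1.0, toy_forward))
```

Working in log space with `log_mean_exp` avoids underflow when some samples assign a very small likelihood to the target, which is common when the modes are far apart.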

Dataset A

Same dataset used by Tang et al. Training data is in blue, predictions are in red.

[Figure: a plot]

Log-probability across the epochs:

[Figure: a log]

Dataset: data.zip
Code: main_A.py.txt

Dataset B

Simple multimodal artificial dataset. Training data is in blue, predictions are in red.

[Figure: b plot]

Log-probability across the epochs:

[Figure: b log]

Code: main_B.py.txt

Hi,

Thank you for your interest in this model. I have to say that the main goal of this repo was to keep my code safe. I did not expect anyone else to use it, which is why it is quite messy.

First of all, I cannot remember whether the code in the "master" branch works correctly. I know that the code in the "develop" branch gave correct results. However, as you can imagine from the name, that code still needs refactoring before it is easy to use. It may also contain some odd optional parameters that were handy for the project I was working on.

Secondly, is there any reason why you use 4 layers with sigmoid activation functions? Have you tried fewer layers, or ReLUs? I am afraid that 4 layers + sigmoid may be causing a vanishing gradient problem during fitting. Also, I can tell you that the initialization of these architectures is quite important.
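
To illustrate the vanishing-gradient concern, here is a minimal sketch that multiplies the activation derivatives along a single path through 4 layers. It deliberately ignores the weight matrices, so it only shows the contribution of the activation function itself. The sigmoid derivative is at most 0.25, so four sigmoid layers scale the gradient by at most 0.25^4, while an active ReLU passes it through unchanged. The helper names are hypothetical, and `glorot_uniform_limit` shows one common initialization rule, not necessarily the one this repo uses.

```python
import math

def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid; its maximum is 0.25 at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

def activation_grad_product(grad_fn, pre_activations):
    """Product of activation derivatives along one path, one factor per layer."""
    product = 1.0
    for z in pre_activations:
        product *= grad_fn(z)
    return product

def glorot_uniform_limit(fan_in, fan_out):
    """Half-width of the Glorot/Xavier uniform initialization range."""
    return math.sqrt(6.0 / (fan_in + fan_out))

# Four layers, as in the settings above.
best_case = activation_grad_product(sigmoid_grad, [0.0] * 4)  # every unit at z = 0
saturated = activation_grad_product(sigmoid_grad, [3.0] * 4)  # mildly saturated units
relu_path = activation_grad_product(relu_grad, [3.0] * 4)     # active ReLU units
print(best_case, saturated, relu_path)

# Weight range for a 16 -> 16 layer under this rule (layer size 16 as above).
print(glorot_uniform_limit(16, 16))
```

Even in the best case the sigmoid path shrinks the gradient by a factor of 256, and saturation makes it much worse, which is why fewer layers, ReLUs, or a careful initialization are worth trying first.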

I will try to have a look at the code and let you know if I reach any conclusion.

pabaldonedo / stochastic_fnn
