Status: Open. micheletufano opened this issue 7 years ago.
Hi,
Thank you for your interest in this model. I have to say that the main goal of this repo was to keep my code safe. I did not expect anyone else to use it, which is why it is quite messy.
First of all, I cannot remember whether the code in the "master" branch works correctly. I know that the code in the "develop" branch gave correct results. However, as you can imagine from the name, it still needs refactoring before it can be used easily. It may also contain some odd optional parameters that were handy for the project I was working on.
Secondly, is there any reason why you use 4 layers with sigmoid activation functions? Have you tried fewer layers, or ReLUs? I am afraid that 4 layers plus sigmoid may be causing a vanishing gradient problem during fitting. I can also tell you that the initialization of these architectures is quite important.
I will try to have a look at the code and let you know if I reach any conclusion.
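To make the vanishing-gradient concern concrete, here is a minimal NumPy sketch. It is not taken from this repo and is not an SFNN: it is a plain deterministic MLP, and the function name `first_layer_grad_norm`, the layer width, the batch size, and the dummy loss (mean of the outputs) are all assumptions chosen for illustration. It backpropagates through 4 layers and reports the gradient norm that reaches the first layer's weights, using Xavier-style scaling for sigmoid and He-style scaling for ReLU.

```python
import numpy as np


def first_layer_grad_norm(activation, n_layers=4, width=50, batch=256, seed=0):
    """Gradient norm w.r.t. the first layer's weights of a toy MLP.

    Hypothetical helper for illustration only; the loss is mean(outputs).
    """
    if activation not in ("sigmoid", "relu"):
        raise ValueError("activation must be 'sigmoid' or 'relu'")

    rng = np.random.default_rng(seed)
    x = rng.standard_normal((batch, width))

    # Xavier-style scale for sigmoid, He-style scale for ReLU.
    scale = np.sqrt(1.0 / width) if activation == "sigmoid" else np.sqrt(2.0 / width)
    weights = [rng.standard_normal((width, width)) * scale for _ in range(n_layers)]

    # Forward pass, caching each layer's input and activation derivative.
    h = x
    cache = []
    for W in weights:
        z = h @ W
        if activation == "sigmoid":
            a = 1.0 / (1.0 + np.exp(-z))
            d = a * (1.0 - a)          # sigmoid'(z), never larger than 0.25
        else:
            a = np.maximum(z, 0.0)
            d = (z > 0).astype(float)  # ReLU'(z), either 0 or 1
        cache.append((h, d))
        h = a

    # Backward pass for loss = mean(h).
    grad = np.ones_like(h) / h.size
    grad_W = None
    for (h_in, d), W in zip(reversed(cache), reversed(weights)):
        grad_z = grad * d
        grad_W = h_in.T @ grad_z       # gradient for this layer's weights
        grad = grad_z @ W.T            # gradient passed to the layer below

    # After the loop, grad_W belongs to the first layer.
    return np.linalg.norm(grad_W)


if __name__ == "__main__":
    for name in ("sigmoid", "relu"):
        print(name, first_layer_grad_norm(name))
```

Since the sigmoid derivative is at most 0.25, each sigmoid layer shrinks the backpropagated signal, so the first-layer gradient should come out much smaller than with ReLU in this toy setting. Whether the same effect explains the poor fits here would have to be checked on the actual network.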
Hi,
First of all, thanks for sharing your code! I'm trying to use this SFNN implementation to model multimodal distributions. I tested it on two datasets, but I could not obtain good results. Did I make any mistakes? Do you have any suggestions?
Network Settings
Dataset A: the same dataset used by Tang et al. Training data is in blue, predictions are in red.
Log-probability across the epochs
Dataset: data.zip Code: main_A.py.txt
Dataset B: a simple multimodal artificial dataset. Training data is in blue, predictions are in red.
Log-probability across the epochs
Code: main_B.py.txt