oriondollar / TransVAE

A Transformer Based VAE Architecture for De Novo Molecular Design
MIT License

Update loss.py #3

Open PyeongKim opened 3 years ago

PyeongKim commented 3 years ago

The dimensions of mu and logvar are (batch x d_latent). If we simply take the mean over everything, as previously implemented, the result averages over all d_latent dimensions regardless of the individual samples. Instead, we should sum over the latent dimensions for each sample and then take the mean along the batch. I think this is closer to the original meaning of the KL divergence.
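For illustration, a minimal sketch of the two behaviors being discussed (the function names are mine, not the actual contents of loss.py, and this assumes the standard diagonal-Gaussian KL term of a VAE):

```python
import torch

def kld_mean_everywhere(mu, logvar):
    # Previous behavior: mean over both batch and d_latent dimensions,
    # which shrinks the KL term by a factor of d_latent.
    return -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())

def kld_sum_then_batch_mean(mu, logvar):
    # Proposed behavior: sum over d_latent per sample,
    # then average across the batch (the standard VAE KL term).
    kld_per_sample = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)
    return kld_per_sample.mean()
```

The two only differ by a constant factor of d_latent, so the original version effectively acts as a smaller implicit weight on the KL term, which is likely why the behavior differs rather than breaks.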

PyeongKim commented 3 years ago

P.S. I'm writing this comment because I'm truly interested in your research, and I'm building on it to expand the concept of exploration. Thank you for your contribution.

oriondollar commented 3 years ago

I think you're right, but I'm going to hold off on merging for now until I have a chance to test the behavior myself. I think this bug may have actually become a feature in some ways. I see you've forked the repo, so I assume you've already modified it in your fork and can still run all the code you need?

Also, I'm excited to see how you expand on the concept of exploration! I'm working on an update that will add the option to append a set of property predictor layers to the latent space. Depending on how you're approaching it, this could give you a way to probe exploration that isn't 100% reliant on purely structural fingerprints.
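For context, one common way to attach property predictor layers to a VAE latent space is a small MLP head on the latent vector z. The sketch below is purely hypothetical (the class name PropertyPredictor, layer sizes, and n_props are illustrative assumptions, not the actual planned update):

```python
import torch.nn as nn

class PropertyPredictor(nn.Module):
    """Hypothetical MLP head mapping a latent vector z to predicted properties."""
    def __init__(self, d_latent, d_hidden=64, n_props=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_latent, d_hidden),
            nn.ReLU(),
            nn.Linear(d_hidden, n_props),
        )

    def forward(self, z):
        # z: (batch, d_latent) sampled or mean latent vectors
        return self.net(z)
```

A head like this would let latent-space exploration be probed through predicted properties rather than only through structural fingerprints.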

PyeongKim commented 3 years ago

Yes, it is running fine! :)