ehoogeboom / e3_diffusion_for_molecules

MIT License

Why set the normalize value and bias? #1

Closed Layne-Huang closed 2 years ago

Layne-Huang commented 2 years ago

Could you please give me some insight into why you implemented the normalize values and bias, and how you determined their values?

ehoogeboom commented 2 years ago

Yes, of course. We ended up not using the bias, but the normalize values are used for h_onehot (the atom type) and h_int (the charge). They correspond to the paragraph titled "Scaling Features" in the paper. Scaling the features changes the diffusion process, and therefore the generative process that needs to be learned.

Essentially, we found that multiplying h_onehot by 0.25 gave a significant performance benefit: in experiments, the model trained with this 0.25 setting scores much higher on the stability metrics. In code we achieve this by giving the appropriate "normalize_values".
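As a rough illustration of what such a scaling step can look like, here is a minimal sketch, not the repository's actual code. It assumes the normalize values act as divisors, so a value of 4 for h_onehot corresponds to multiplying by 0.25. The function names, the tuple layout, and the value used for h_int are all hypothetical.

```python
import numpy as np

# Hypothetical layout: (x, h_onehot, h_int). Values are treated as divisors,
# so 4.0 for h_onehot is the same as multiplying by 0.25. The h_int value
# below is a placeholder and is not taken from this thread.
NORM_VALUES = (1.0, 4.0, 1.0)
NORM_BIASES = (0.0, 0.0, 0.0)  # kept at zero, since the bias ended up unused


def normalize(x, h_onehot, h_int, values=NORM_VALUES, biases=NORM_BIASES):
    """Scale the features before the diffusion process is applied."""
    x = x / values[0]  # positions are only scaled, never shifted
    h_onehot = (h_onehot - biases[1]) / values[1]
    h_int = (h_int - biases[2]) / values[2]
    return x, h_onehot, h_int


def unnormalize(x, h_onehot, h_int, values=NORM_VALUES, biases=NORM_BIASES):
    """Invert normalize() on samples produced by the generative process."""
    x = x * values[0]
    h_onehot = h_onehot * values[1] + biases[1]
    h_int = h_int * values[2] + biases[2]
    return x, h_onehot, h_int


# Toy molecule with 3 atoms and 5 possible atom types (made-up numbers).
x = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.96], [0.93, 0.0, -0.24]])
h_onehot = np.eye(5)[[2, 0, 0]]
h_int = np.zeros((3, 1))

x_n, h_onehot_n, h_int_n = normalize(x, h_onehot, h_int)
x_r, h_onehot_r, h_int_r = unnormalize(x_n, h_onehot_n, h_int_n)
```

After normalize(), the one-hot entries are 0.25 instead of 1, while the positions are unchanged. unnormalize() recovers the original features.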

We think that it is easier for the model to first get a rough estimate of x (positions), and then decide h (type & charge). By multiplying h by 0.25, the denoising process focuses more on x in the early stages of the generative process, and only later on h.
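One way to see this effect numerically is through the signal-to-noise ratio. The sketch below assumes a variance-preserving noising process z_t = alpha_t * feature + sigma_t * eps with sigma_t^2 = 1 - alpha_t^2, and a cosine schedule alpha_t = cos(pi/2 * t) chosen only for illustration. The schedule and the function names are assumptions, not the ones used in the paper or the repository.

```python
import numpy as np


def alpha(t):
    """Illustrative cosine schedule: alpha(0) = 1 (clean), alpha(1) = 0 (noise)."""
    return np.cos(0.5 * np.pi * t)


def snr(scale, t):
    """Signal-to-noise ratio of a unit-size feature multiplied by `scale`."""
    a = alpha(t)
    return (scale * a) ** 2 / (1.0 - a ** 2)


def crossing_time(scale):
    """Time t at which snr(scale, t) == 1 under the cosine schedule above.

    Solving scale^2 * cos^2 / sin^2 = 1 gives tan(pi/2 * t) = scale.
    """
    return 2.0 / np.pi * np.arctan(scale)


t_x = crossing_time(1.0)   # positions, left unscaled
t_h = crossing_time(0.25)  # atom types, multiplied by 0.25

# At any fixed t the scaled feature has 0.25^2 = 1/16 of the SNR.
ratio = snr(0.25, 0.3) / snr(1.0, 0.3)
```

Sampling runs from t = 1 down to t = 0, so a smaller crossing time means the feature rises above the noise later in generation. Under these assumptions t_h is smaller than t_x, which matches the intuition that x is resolved first and h only afterwards.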