novelty in QM9 dataset is so small, why?

cvignac / DiGress

code for the paper "DiGress: Discrete Denoising diffusion for graph generation"

MIT License

349 stars, 73 forks

novelty in QM9 dataset is so small, why? #53

Closed: FairyFali closed this issue 1 year ago

FairyFali commented 1 year ago

These are my results from running on QM9. I have two questions:

The running time is 9 hours, not the 1 hour mentioned in the paper. Why?
Why is the novelty so low?

cvignac commented 1 year ago

Hello, as explained in the table in the paper, "Training time is the time needed to reach 99% validity." It is not the duration of a full training run.
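To make the distinction concrete, here is a minimal sketch of how a "time to reach 99% validity" figure could be read off a training log. The function name `time_to_validity`, the log format, and the numbers are all hypothetical and made up for illustration; this is not code or data from the DiGress repository.

```python
def time_to_validity(log, threshold=0.99):
    """Return the elapsed hours at the first checkpoint whose validity
    reaches `threshold`, or None if it is never reached.

    `log` is a list of (elapsed_hours, validity) pairs in chronological order.
    """
    for hours, validity in log:
        if validity >= threshold:
            return hours
    return None


# Hypothetical log: the numbers are invented for illustration only.
log = [(0.5, 0.95), (1.0, 0.99), (4.0, 0.992), (9.0, 0.993)]

reached_at = time_to_validity(log)
total_runtime = log[-1][0]
print(f"threshold reached after {reached_at} h, full run took {total_runtime} h")
```

Under these assumed numbers, the run crosses the threshold long before it finishes, so the two durations can differ widely.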

Novelty is low because QM9 is an exhaustive enumeration of molecules that satisfy some constraints, so a valid generated molecule that meets those constraints is likely to be in the dataset already. See Section 5.4 of https://arxiv.org/pdf/2110.02096.pdf for a discussion of why high novelty is not a good thing for QM9.
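A toy sketch of this effect, assuming novelty is defined as the fraction of unique generated samples that are absent from the training set. Short strings stand in for canonical SMILES, the "universe" is a made-up exhaustive enumeration, and `novelty` is a hypothetical helper, not the metric code from the DiGress repository.

```python
from itertools import product


def novelty(generated, training):
    """Fraction of unique generated items that do not appear in `training`."""
    unique = set(generated)
    if not unique:
        return 0.0
    return len(unique - set(training)) / len(unique)


# Toy exhaustive enumeration: every string of length 1 to 3 over {C, N, O}.
universe = sorted(
    "".join(chars) for n in (1, 2, 3) for chars in product("CNO", repeat=n)
)

# The training set covers almost the whole universe (every 10th item held out).
training = [s for i, s in enumerate(universe) if i % 10 != 0]

# A generator that perfectly respects the constraints can only produce
# members of the universe, so most of its outputs are already in training.
generated = list(universe)

print(f"novelty = {novelty(generated, training):.3f}")
```

In this setting, a high novelty score would mean the generator is producing strings outside the enumerated universe, i.e. samples that violate the constraints defining the dataset.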