Discrepancy for MOSES dataset evaluation protocol

qiyan98 commented 1 year ago

Hi,

I noticed that the number of molecules generated for evaluation on the MOSES dataset is 25000, as specified in the config file: https://github.com/cvignac/DiGress/blob/150ca149394ddbb32e855f4092b8dc1acdfce8f7/configs/experiment/moses.yaml#L17

The number of molecules is also 25000 in your shared SMILES samples: https://github.com/cvignac/DiGress/blob/main/generated_samples/generated_smiles_moses.txt.
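For anyone who wants to double-check the count, here is a minimal sketch. It assumes the file holds one SMILES string per line, which I have not verified beyond a quick look; the function name and the local path in the usage comment are hypothetical.

```python
from typing import Iterable


def count_smiles(lines: Iterable[str]) -> int:
    """Count non-empty lines, assuming one SMILES string per line."""
    return sum(1 for line in lines if line.strip())


# Usage with a local copy of the file (the path is an assumption):
# with open("generated_smiles_moses.txt") as f:
#     print(count_smiles(f))
```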

However, the original MOSES paper suggests using 30000 generated samples for evaluation.

Source: https://arxiv.org/pdf/1811.12823.pdf#page=3

I'm new to this dataset and confused by the discrepancy. Could you explain why 25000 was chosen instead of 30000?

Thanks, Qi

cvignac commented 1 year ago

If you check the code of MOSES, I think that internally it uses 20000 valid samples to compute metrics. Since we can get enough valid molecules by sampling 25k molecules, we did not sample more.
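As a rough illustration of the arithmetic, the sketch below takes the 20000 figure mentioned above as given (it is stated from memory and not checked against the MOSES code here) and computes the validity rate that 25000 samples would need to reach it. The function names are hypothetical and not part of MOSES or DiGress.

```python
def min_validity_rate(n_valid_needed: int, n_sampled: int) -> float:
    """Smallest fraction of valid samples that still yields n_valid_needed."""
    return n_valid_needed / n_sampled


def has_enough_valid(n_valid: int, n_valid_needed: int = 20000) -> bool:
    """Check whether the number of valid molecules meets the assumed target."""
    return n_valid >= n_valid_needed


# With 25000 samples, at least 80% must be valid to reach 20000.
print(min_validity_rate(20000, 25000))  # 0.8
```

Under that assumption, sampling 25000 molecules is sufficient whenever the model's validity is at least 80%.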

qiyan98 commented 1 year ago

Got it. Thanks!

cvignac / DiGress

Discrepancy for MOSES dataset evaluation protocol #61