ehoogeboom / e3_diffusion_for_molecules

MIT License

Question regarding the dimension of QM9 and input to property classifier #22

Closed yogeshverma1998 closed 7 months ago

yogeshverma1998 commented 1 year ago

Hi,

I have a question about the maximum number of atoms used for QM9 generation. The maximum number of atoms for the QM9 dataset is 9, but when I ran the code, it used an input of shape (29, 5) for the property classifier. Can you let me know where the extra entries are coming from?

Also, do you transform the predicted coordinates or any other outputs from EDM before feeding them into the property classifier, and if so, how?

Regards, Yogesh

amorehead commented 1 year ago

I may be wrong here, but I believe 5 is the number of atom types in the GEOM-Drugs dataset (possibly a coincidence, or maybe not). If you are not secretly evaluating your models on the GEOM-Drugs dataset, I'm not sure where this 5 would be coming from.

tuln128 commented 1 year ago

According to the code for preprocessing the input data (the _extractconformer function, https://github.com/ehoogeboom/e3_diffusion_for_molecules/blob/main/build_geom_dataset.py), I have figured out that, in this case:

Hope it helps.
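
As a rough illustration of how a (29, 5) input can arise for QM9, here is a minimal sketch. It assumes that hydrogens are counted (QM9 molecules have at most 9 heavy atoms, but up to 29 atoms in total) and that atom types are one-hot encoded over the five elements that occur in QM9 (H, C, N, O, F). The names `ATOM_TYPES`, `MAX_ATOMS`, and `one_hot_padded` are hypothetical and are not taken from the repository, and the padding scheme is an assumption rather than the repository's actual preprocessing.

```python
# Hypothetical sketch, not code from the repository.
ATOM_TYPES = ["H", "C", "N", "O", "F"]  # the five elements that occur in QM9
MAX_ATOMS = 29                           # largest QM9 molecule, hydrogens included


def one_hot_padded(symbols, max_atoms=MAX_ATOMS, atom_types=ATOM_TYPES):
    """Return a max_atoms x len(atom_types) one-hot matrix, zero-padded."""
    if len(symbols) > max_atoms:
        raise ValueError("molecule has more atoms than max_atoms")
    # Start from all-zero rows; rows beyond the real atoms stay as padding.
    rows = [[0] * len(atom_types) for _ in range(max_atoms)]
    for i, symbol in enumerate(symbols):
        rows[i][atom_types.index(symbol)] = 1
    return rows


# Methane: 1 heavy atom, but 5 atoms once hydrogens are counted.
methane = ["C", "H", "H", "H", "H"]
features = one_hot_padded(methane)
shape = (len(features), len(features[0]))
print(shape)  # (29, 5)
```

Under these assumptions, the 29 comes from the largest molecule with hydrogens included, and the 5 from the number of atom types, so a count of 9 would only apply if hydrogens were excluded.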