nyu-dl / dl4chem-mgm

BSD 3-Clause "New" or "Revised" License

SMILES vs. Graphs #6

Closed: hossein-amirkhani closed this issue 2 years ago

hossein-amirkhani commented 2 years ago

Even though the goal of the presented method is to generate graphs, I noticed in the code that you ultimately convert them to SMILES. According to your paper, this is because "the GuacaMol benchmark requires that graph representations be converted into SMILES strings before evaluation." I have two questions, and I would really appreciate your answers:

1. Do the generated graphs (before conversion to SMILES) have properties that are not available in the converted SMILES, or do the two representations convey the same information?
2. It seems that both datasets used to pre-train the generator include SMILES strings. How can you use SMILES strings to generate a higher-level representation such as graphs? I assume that graphs include more information than SMILES.

omarnmahmood commented 2 years ago

Currently, the two representations contain the same information. In principle, graph-based representations can carry other types of information, e.g. 3D coordinates, which you could compute and add to the existing representations, but we have not implemented that so far.
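
To illustrate the point, here is a minimal sketch in plain Python. It is not the code from this repository: `MolGraph` and `smiles_to_graph` are hypothetical names, and the parser only handles a toy SMILES subset (single-letter uppercase atoms, `-`/`=`/`#` bonds, branches, single-digit ring closures). A real pipeline would use a cheminformatics toolkit such as RDKit instead. The sketch shows that the atoms and bonds in a SMILES string map directly onto nodes and edges, and that a graph can additionally hold per-node data such as coordinates, which the string has no place for.

```python
from dataclasses import dataclass, field

@dataclass
class MolGraph:
    """Hypothetical container for a molecular graph (illustration only)."""
    atoms: list = field(default_factory=list)   # element symbol per node
    bonds: dict = field(default_factory=dict)   # (i, j) with i < j -> bond order
    coords: dict = field(default_factory=dict)  # optional: node index -> (x, y, z)

BOND_ORDER = {"-": 1, "=": 2, "#": 3}

def smiles_to_graph(smiles):
    """Parse a toy SMILES subset into a MolGraph.

    Assumption: only single-letter uppercase atoms, explicit bond symbols,
    parenthesised branches and single-digit ring closures are supported.
    """
    g = MolGraph()
    prev = None    # index of the atom the next atom will bond to
    order = 1      # bond order waiting to be applied
    stack = []     # atoms at which a branch was opened
    rings = {}     # open ring-closure digit -> (atom index, bond order)
    for ch in smiles:
        if ch in BOND_ORDER:
            order = BOND_ORDER[ch]
        elif ch == "(":
            stack.append(prev)
        elif ch == ")":
            prev = stack.pop()
        elif ch.isdigit():
            if ch in rings:
                # Second occurrence of the digit closes the ring.
                other, ring_order = rings.pop(ch)
                g.bonds[(other, prev)] = max(order, ring_order)
            else:
                rings[ch] = (prev, order)
            order = 1
        elif ch in "BCNOFPSI":
            idx = len(g.atoms)
            g.atoms.append(ch)
            if prev is not None:
                g.bonds[(prev, idx)] = order
            prev, order = idx, 1
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
    return g

# Acetic acid: the graph holds exactly the atoms and bonds the string encodes.
g = smiles_to_graph("CC(=O)O")
print(g.atoms)
print(g.bonds)

# Extra information a graph could carry but the SMILES string does not.
# The coordinate below is a made-up placeholder, not a computed geometry.
g.coords[0] = (0.0, 0.0, 0.0)
```

Going in the other direction, writing this graph back out as SMILES would keep the atoms and bonds but drop `coords`, which matches the statement that the two representations are equivalent as long as no extra attributes are attached.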

hossein-amirkhani commented 2 years ago

Thanks