DeepGraphLearning / torchdrug

A powerful and flexible machine learning platform for drug discovery
https://torchdrug.ai/
Apache License 2.0

atom_feature="symbol" is only available for class GCPNGeneration() ? #240

Open teltim opened 10 months ago

teltim commented 10 months ago
     ...
    @torch.no_grad()
    def generate(self, num_sample, max_resample=20, off_policy=False, max_step=30 * 2, initial_smiles="C", verbose=0):
        is_training = self.training
        self.eval()
        graph = data.Molecule.from_smiles(initial_smiles, kekulize=True, atom_feature="symbol").repeat(num_sample)
     ...

Is there a reason for restricting the atom features to "symbol" during generation? Does this mean that generation will not work with any other atom feature?

In practice, reinforcement_learning does not take long with "symbol", but it can take a long time with, for instance, "explicit_property_prediction". I do not know the reason...

chrisvdwerf commented 6 months ago

I noticed that, when splitting up finalized molecular graphs into subgraphs, there is no refeaturization in the torchdrug implementation of GCPN.

For some features this can be troublesome, e.g. the number of implicit hydrogen atoms, as these node features depend on the neighbourhood of a node.

As such, I think that you should be careful about which features you include in the node embeddings.
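
To make the concern concrete, here is a minimal toy sketch in plain Python. It is not torchdrug code: the helper names (`implicit_hydrogens`, `node_subgraph`) and the simplified valence table are assumptions made up for illustration, and only single bonds are considered. It shows that a neighbourhood-dependent feature copied from the full molecule goes stale in a subgraph, while the atom symbol does not.

```python
# Toy illustration only: plain Python, not torchdrug's implementation.
# Simplified valences, valid for single bonds only (an assumption).
DEFAULT_VALENCE = {"C": 4, "N": 3, "O": 2}


def implicit_hydrogens(symbols, bonds):
    """Implicit hydrogens per atom: valence minus number of bonded neighbours."""
    degree = [0] * len(symbols)
    for u, v in bonds:
        degree[u] += 1
        degree[v] += 1
    return [DEFAULT_VALENCE[s] - d for s, d in zip(symbols, degree)]


def node_subgraph(symbols, bonds, keep):
    """Keep the listed atoms and the bonds between them, reindexing the atoms."""
    index = {old: new for new, old in enumerate(keep)}
    sub_symbols = [symbols[i] for i in keep]
    sub_bonds = [(index[u], index[v]) for u, v in bonds
                 if u in index and v in index]
    return sub_symbols, sub_bonds


# Ethanol heavy atoms, C-C-O.
symbols = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]
full_h = implicit_hydrogens(symbols, bonds)

# Drop the oxygen, as happens when a finished molecule is cut into a subgraph.
keep = [0, 1]
sub_symbols, sub_bonds = node_subgraph(symbols, bonds, keep)

cached_h = [full_h[i] for i in keep]                   # copied, not refeaturized
fresh_h = implicit_hydrogens(sub_symbols, sub_bonds)   # recomputed on the subgraph

print("symbols:", sub_symbols)   # unaffected by the cut
print("cached :", cached_h)
print("fresh  :", fresh_h)
```

In this sketch the second carbon keeps its old hydrogen count in `cached_h` even though it has lost a neighbour, whereas `sub_symbols` is still correct. That is consistent with "symbol" being a safe choice when subgraphs are not refeaturized.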