tencent-alchemy / Alchemy

https://alchemy.tencent.com/
MIT License

[Roadmap] DGL support checklist #5

Closed xptree closed 5 years ago

xptree commented 5 years ago

We should support DGL, which is another GNN library. Here is the release plan. The tentative release date is 07/31/2019, before the contest enters phase two.

[Feature] DGL dataloader

Release a DGL dataloader for reading the Alchemy dataset.

[Feature] DGL MPNN model

Release a DGL MPNN model, as we have done for PyG. The MPNN model is GGNN + set2set; SchNet and MGCN are also included.
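To make the "GGNN + set2set" description more concrete, here is a minimal pure-Python sketch of the attention-weighted readout step that set2set builds on. It is an illustration only, not the DGL or PyG implementation: the function name `attention_readout` is hypothetical, and the LSTM that updates the query vector between set2set iterations is omitted.

```python
import math


def attention_readout(node_feats, query):
    """Pool per-node feature vectors into one graph-level vector.

    Hypothetical helper for illustration: each node is scored by its dot
    product with `query`, the scores go through a softmax, and the node
    features are summed using the resulting weights.
    """
    # Dot-product score between every node feature vector and the query.
    scores = [sum(x * q for x, q in zip(feat, query)) for feat in node_feats]

    # Numerically stable softmax over the nodes of one graph.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of node features, one output value per feature dimension.
    dim = len(query)
    readout = [
        sum(w * feat[d] for w, feat in zip(weights, node_feats))
        for d in range(dim)
    ]
    return readout, weights
```

In a full set2set layer this step would be repeated several times, with the readout concatenated to the query and fed back through a recurrent cell; that part is left out here.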

yzh119 commented 5 years ago

Hi, glad to hear that you would like to support DGL. We plan to add graph pooling that supports set2set; you can find our PR here.

xptree commented 5 years ago

@yzh119 Super! I have added the PR link to our release plan. We can help check if the DGL set2set performance is comparable to the PyG implementation. BTW, when do you plan to release the set2set implementation?

yzh119 commented 5 years ago

@xptree We are trying to complete this PR within this week. But it will take longer (about two months) to release DGL 0.4 because we have many other features to support (Heterograph, Knowledge Graph, etc.). Before that, users can build DGL from source to use the pooling module.

xptree commented 5 years ago

@yzh119 Got it. We will check DGL's set2set after the PR is merged. @GuangyongChen Since DGL 0.4 will be released after Alchemy's phase two, we may have to ask contest participants to compile DGL from source rather than install from pip/conda.

xptree commented 5 years ago

@GuangyongChen To make this a complete benchmark, shall we also upload the other models that we have implemented? I mean the models we reported in our arXiv paper.

yvquanli commented 5 years ago

You can take a look at my repository: https://github.com/yvquanli/smiles_mol_dataset_in_dgl_styles

xptree commented 5 years ago

@yvquanli Thanks! We will definitely refer to it. We also put our version at https://github.com/tencent-alchemy/Alchemy/tree/dgl.

mufeili commented 5 years ago

In case you are interested, here is another code example for converting SMILES to DGLGraph: https://gist.github.com/mufeili/8eb0c1cdf23604e7da7445c49a33676a. Compared to @yvquanli's great example, this one does not need networkx as an intermediate representation.
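As a rough illustration of what skipping the networkx intermediate can look like, the sketch below turns a list of undirected bonds directly into source and destination index lists, the form a graph library's edge-adding call typically accepts. This is not the code from the linked gist: the function name `bonds_to_edges` is hypothetical, and parsing the SMILES string into atoms and bonds (for example with RDKit) is assumed to have happened already.

```python
def bonds_to_edges(bonds):
    """Convert undirected bonds into directed edge lists.

    `bonds` is an iterable of (atom_u, atom_v) index pairs. Each bond is
    emitted in both directions so that messages can flow either way.
    Hypothetical helper for illustration only.
    """
    src, dst = [], []
    for u, v in bonds:
        src.extend([u, v])
        dst.extend([v, u])
    return src, dst


# Ethanol ("CCO") written out by hand: atoms 0 (C), 1 (C), 2 (O).
ethanol_bonds = [(0, 1), (1, 2)]
src, dst = bonds_to_edges(ethanol_bonds)
```

The resulting `src` and `dst` lists could then be passed to the graph constructor or edge-adding method of the library in use, together with per-atom features.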

geekinglcq commented 5 years ago

> you can see at this repository of me. https://github.com/yvquanli/smiles_mol_dataset_in_dgl_styles

Thank you for your contribution. BTW, I found that you have posted a multiprocessing dataloader. Could @Moxinlin please check the following code and update it?

    def _single_sdf_graph_reader(self, sdf_file):
        # Worker wrapper: always return a (graph, label) pair so that
        # pool.map yields uniformly shaped results.
        result = self.sdf_graph_reader(sdf_file)
        if result is None:
            return (None, None)
        return result

    def _load(self):
        if self.mode == 'dev':
            target_file = pathlib.Path(self.file_dir, "train.csv")
            property_cols = ['property_%d' % x for x in range(12)]
            self.target = pd.read_csv(target_file,
                                      index_col=0,
                                      usecols=['gdb_idx'] + property_cols)
            self.target = self.target[property_cols]

        sdf_dir = pathlib.Path(self.file_dir, "sdf")
        self.graphs, self.labels = [], []

        # Parse the SDF files in parallel instead of the earlier serial loop.
        from multiprocessing import Pool
        with Pool() as pool:
            results = pool.map(self._single_sdf_graph_reader, sdf_dir.glob("**/*.sdf"))

        # Skip files that failed to parse, as the serial loop did with `continue`.
        results = [r for r in results if r[0] is not None]
        if results:
            self.graphs, self.labels = map(list, zip(*results))
        self.normalize()
        print(len(self.graphs), "loaded!")
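
The parallel loading pattern in the code above can also be sketched as a standalone snippet, which makes the handling of failed files easier to check in isolation. This is a minimal sketch under assumptions: `drop_failed` and `load_parallel` are hypothetical names, and `reader` stands in for any picklable function that returns a `(graph, label)` pair or `(None, None)` on failure.

```python
from multiprocessing import Pool


def drop_failed(results):
    """Remove (None, None) placeholders and split the pairs into two lists."""
    kept = [(graph, label) for graph, label in results if graph is not None]
    if not kept:
        # zip(*[]) cannot be unpacked into two names, so handle this case here.
        return [], []
    graphs, labels = zip(*kept)
    return list(graphs), list(labels)


def load_parallel(reader, paths, processes=None):
    """Apply `reader` to every path in a worker pool and keep the successes.

    `reader` must be picklable (for example a module-level function),
    because it is sent to the worker processes.
    """
    with Pool(processes) as pool:
        results = pool.map(reader, list(paths))
    return drop_failed(results)
```

On platforms that start workers with `spawn`, `load_parallel` should be called from under an `if __name__ == "__main__":` guard.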

xptree commented 5 years ago

@GuangyongChen Would you mind completing the README? Please add documentation about DGL.

GuangyongChen commented 5 years ago

@xptree No problem. We will revise the README by next week.

GuangyongChen commented 5 years ago

Hi @xptree, @geekinglcq just revised the README file to add some documentation about DGL. Please take a look.

xptree commented 5 years ago

@GuangyongChen It seems that we have completed all features described in this issue. Shall we close this issue?

GuangyongChen commented 5 years ago

@xptree Thanks for the reminder. I will close this issue.