Closed xptree closed 5 years ago
Hi, glad to hear that you would like to support DGL. We plan to add graph pooling with set2set support; you can find our PR here.
@yzh119 Super! I have added the PR link to our release plan. We can help check if the DGL set2set performance is comparable to the PyG implementation. BTW, when do you plan to release the set2set implementation?
@xptree We are trying to complete this PR within this week. But it will take longer (about two months) to release DGL 0.4 because we have many other features to support (Heterograph, Knowledge Graph, etc.). Before that, users can build DGL from source to use the pooling module.
@yzh119 Got it. We will check DGL's set2set after the PR is merged. @GuangyongChen Since DGL 0.4 will be released after Alchemy's phase two, we may have to ask contest participants to compile DGL from source rather than install from pip/conda.
@GuangyongChen As a complete benchmark, shall we also upload other models that we have implemented? I mean models we reported in our arXiv paper.
You can take a look at my repository: https://github.com/yvquanli/smiles_mol_dataset_in_dgl_styles
@yvquanli Thanks! We will definitely refer to it. We also put our version at https://github.com/tencent-alchemy/Alchemy/tree/dgl.
In case you are interested, here is another code example for converting SMILES to DGLGraph: https://gist.github.com/mufeili/8eb0c1cdf23604e7da7445c49a33676a. Compared to @yvquanli's great example, this one does not need networkx as an intermediate step.
Thank you for your contribution. BTW, I found that you have posted a multiprocessing dataloader. Could @Moxinlin please check the following code and update it?
    def _single_sdf_graph_reader(self, sdf_file):
        result = self.sdf_graph_reader(sdf_file)
        if result is None:
            return (None, None)
        return result

    def _load(self):
        if self.mode == 'dev':
            target_file = pathlib.Path(self.file_dir, "train.csv")
            self.target = pd.read_csv(target_file,
                                      index_col=0,
                                      usecols=[
                                          'gdb_idx',
                                      ] +
                                      ['property_%d' % x for x in range(12)])
            self.target = self.target[['property_%d' % x for x in range(12)]]

        sdf_dir = pathlib.Path(self.file_dir, "sdf")
        self.graphs, self.labels = [], []
        # cnt = 0
        # from tqdm import tqdm
        # for sdf_file in tqdm(sdf_dir.glob("**/*.sdf")):
        #     result = self.sdf_graph_reader(sdf_file)
        #     if result is None:
        #         continue
        #     cnt += 1
        #     self.graphs.append(result[0])
        #     self.labels.append(result[1])
        from multiprocessing import Pool
        with Pool() as pool:
            results = pool.map(self._single_sdf_graph_reader, sdf_dir.glob("**/*.sdf"))
        self.graphs, self.labels = list(zip(*results))
        self.normalize()
        print(len(self.graphs), "loaded!")
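One thing worth checking in the snippet above: a failed read comes back as `(None, None)`, so `None` entries end up in `self.graphs` and `self.labels`, whereas the commented-out loop skipped failed files. Unpacking `list(zip(*results))` also raises a `ValueError` when no file is found. Below is a minimal sketch of filtering the results after `pool.map`. It assumes the reader returns `None` on failure; `read_one` and `keep_valid` are hypothetical names used only for illustration and are not part of the Alchemy code.

```python
from multiprocessing import Pool


def read_one(path):
    """Hypothetical stand-in for sdf_graph_reader.

    Returns a (graph, label) pair, or None when the file cannot be parsed.
    """
    if path.endswith(".bad"):
        return None
    return ("graph:" + path, len(path))


def keep_valid(results):
    """Drop failed reads and split the rest into graph and label lists."""
    valid = [r for r in results if r is not None]
    if not valid:
        # zip(*[]) yields nothing, so handle the empty case explicitly.
        return [], []
    graphs, labels = zip(*valid)
    return list(graphs), list(labels)


if __name__ == "__main__":
    paths = ["a.sdf", "b.bad", "c.sdf"]
    # The worker must be a picklable, module-level function.
    with Pool(2) as pool:
        results = pool.map(read_one, paths)
    graphs, labels = keep_valid(results)
    print(len(graphs), "loaded!")
```

With this approach the wrapper that turns `None` into `(None, None)` would no longer be needed, since failed reads are filtered before unpacking.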
@GuangyongChen Would you mind completing the README? Please add documentation about DGL.
@xptree No problem. We will revise the README by next week.
Hi @xptree, @geekinglcq just revised the README file to add some documentation about DGL. Please check it for your reference.
@GuangyongChen It seems that we have completed all features described in this issue. Shall we close this issue?
@xptree Thanks for the reminder, I will close this issue.
We should support DGL, which is another GNN library. Here is the release plan. The tentative release date is 07/31/2019, before the contest enters phase two.
[Feature] DGL dataloader
Release a DGL dataloader for reading the Alchemy dataset.
[Feature] DGL MPNN model
Release a DGL MPNN model, as we have done for PyG. The MPNN model is GGNN + set2set; SchNet and MGCN are also included: