dmlc / dgl

Python package built to ease deep learning on graph, on top of existing DL frameworks.
http://dgl.ai
Apache License 2.0

[Roadmap] v0.3 release checklist #450

Closed jermainewang closed 5 years ago

jermainewang commented 5 years ago

Here is the v0.3 release plan. The tentative release date is 06/07.

[Feature] Kernel support

Kernels are critical for our system performance. The next release will include all the basic building blocks and APIs for future extensions.

[Feature] Giant graph support

Release a demo showing how to train GNNs on giant graphs that do not fit in the memory of a single GPU. This includes:

[Enhancement] NN module

[Enhancement] graph structure

Model Examples

Bug fix

Features postponed to v0.4

[Feature] DGL graph data format

As we want to include more popular graph datasets in DGL, it is time to decouple the datasets from the DGL repo.

[Feature] Bipartite (k-partite) graph API

Bipartite graphs are popular in recommendation settings. We have heard many requests for this.

[Enhancement] Improve IR & scheduling system

yzh119 commented 5 years ago

Some additional items:

yzh119 commented 5 years ago

For the API, I suggest providing a more flexible message function to handle different types of edges. For example, in Tree-LSTM we are supposed to send different messages for the left branch and the right branch.
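The idea can be illustrated without any DGL API. The sketch below is plain Python, and every name in it (`EDGE_LEFT`, `EDGE_RIGHT`, `message_by_type`, `send_and_collect`, and the per-type scaling factors) is hypothetical, chosen only to show a message function that dispatches on an edge type. It is not how DGL or Tree-LSTM implements this.

```python
# Hypothetical sketch: a message function whose output depends on the edge type.
# Nothing here is DGL API; the edge types and per-type transforms are made up.
from collections import defaultdict

EDGE_LEFT, EDGE_RIGHT = "left", "right"

# Edges of a tiny binary tree, stored as (child, parent, edge_type).
edges = [
    (1, 0, EDGE_LEFT),
    (2, 0, EDGE_RIGHT),
]

# Scalar node states; a real Tree-LSTM would carry hidden and cell vectors.
h = {0: 0.0, 1: 1.0, 2: 10.0}


def message_by_type(src_state, edge_type):
    """Return a message whose form depends on the edge type."""
    if edge_type == EDGE_LEFT:
        return {"left": 2.0 * src_state}
    if edge_type == EDGE_RIGHT:
        return {"right": 0.5 * src_state}
    raise ValueError(f"unknown edge type: {edge_type}")


def send_and_collect(edges, h):
    """Send one message per edge and group the messages by destination node."""
    mailbox = defaultdict(dict)
    for src, dst, etype in edges:
        mailbox[dst].update(message_by_type(h[src], etype))
    return dict(mailbox)


mailbox = send_and_collect(edges, h)
print(mailbox)  # {0: {'left': 2.0, 'right': 5.0}}
```

The parent node receives the left and right messages under separate keys, so a reduce step could treat the two branches differently.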

aksnzhy commented 5 years ago

We plan to implement a light-weight kvstore instead of using MXNet KVStore for DGL. Should we consider adding this item to the 0.3 release?

zheng-da commented 5 years ago

If time permits, we might add GPU samplers to accelerate mini-batch training on a single GPU.

zheng-da commented 5 years ago

> We plan to implement a light-weight kvstore instead of using MXNet KVStore for DGL. Should we consider adding this item to the 0.3 release?

The 0.3 version will be a fast release. How about we plan it for the 0.4 release?

aksnzhy commented 5 years ago

> > We plan to implement a light-weight kvstore instead of using MXNet KVStore for DGL. Should we consider adding this item to the 0.3 release?
>
> The 0.3 version will be a fast release. How about we plan it for the 0.4 release?

No problem.

qzshadow commented 5 years ago

Looking forward to support for giant graphs!

alexvpickering commented 5 years ago

A few models I'd like to see:

jermainewang commented 5 years ago

Hi @alexvpickering, will look into it. One question: do you expect to see them as layers in NN packages or as complete examples with data loading/training/testing?

alexvpickering commented 5 years ago

Thanks @jermainewang! I was thinking of examples like those found in, e.g., dgl/examples/pytorch/sgc.

ghost commented 5 years ago

Will some of the models in Euler, such as LsHNE, LasGNN, and ScalableGCN, be implemented in the future? These algorithms, like PinSage, are applied in production environments and support heterogeneous networks.

zheng-da commented 5 years ago

@HuangZhanPeng We will work on heterogeneous graphs in the 0.4 release. We'll evaluate these models.

gasteigerjo commented 5 years ago

It would be great if you could also implement APPNP! It's quite simple and performed best in PyTorch Geometric's benchmark, so people could clearly benefit here as well. :)

mufeili commented 5 years ago

Thank you for the suggestion @klicperajo. I think we already have an implementation of this work :), though we may need to tune it more carefully to achieve better performance. See the DGL APPNP example.

gasteigerjo commented 5 years ago

Oh, nice! I didn't notice since it's not in the summary table. Yes, there are a couple of details needed to get the last tenths of a percent of accuracy. They should all be in the paper, though. :)

jermainewang commented 5 years ago

Just updated the roadmap with a checklist. Our tentative date for this release is 06/07.

For all committers @zheng-da @szha @BarclayII @VoVAllen @ylfdq1118 @yzh119 @GaiYu0 @mufeili @aksnzhy @zzhang-cn @ZiyueHuang , please vote with :+1: if you agree with this plan.

szha commented 5 years ago

did you mean 6/7?

VoVAllen commented 5 years ago

@yzh119 will be in charge of edge/node removal

mufeili commented 5 years ago

@jermainewang What do you need for issues under NN module?

yzh119 commented 5 years ago

One question: do we need to rebuild node/edge index when calling node/edge removal? For example:

import dgl
import torch as th
g = dgl.DGLGraph()
g.add_nodes(5)
g.ndata['x'] = th.rand(5, 3)
g.del_nodes([2, 3])
print(g.nodes())

What should be the output? tensor([0, 1, 4])?
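To make the two possible answers concrete, here is a minimal sketch in plain Python. It does not use DGL and does not claim to show what DGL does; the function names `remove_and_relabel` and `remove_and_keep_ids` are hypothetical and only contrast the two semantics in question.

```python
# Hypothetical sketch contrasting two node-removal semantics; not DGL code.

def remove_and_relabel(num_nodes, removed):
    """Drop nodes and renumber the survivors consecutively from 0.

    Returns the new node IDs and a map from each surviving old ID to its new ID.
    """
    removed = set(removed)
    survivors = [n for n in range(num_nodes) if n not in removed]
    old_to_new = {old: new for new, old in enumerate(survivors)}
    return list(range(len(survivors))), old_to_new


def remove_and_keep_ids(num_nodes, removed):
    """Drop nodes but let the survivors keep their original IDs."""
    removed = set(removed)
    return [n for n in range(num_nodes) if n not in removed]


new_ids, old_to_new = remove_and_relabel(5, [2, 3])
print(new_ids)                          # [0, 1, 2]
print(old_to_new)                       # {0: 0, 1: 1, 4: 2}
print(remove_and_keep_ids(5, [2, 3]))   # [0, 1, 4]
```

Under relabeling, the node formerly known as 4 becomes node 2, so any ID a user stored before the removal silently points at a different node afterwards. Keeping the original IDs avoids that, at the cost of a non-contiguous ID space.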

yzh119 commented 5 years ago

I must say, if the node index and edge index have to be rebuilt after node/edge removal, the behavior of these operations would be VERY CONFUSING. I can't see any scenario in which these operations would make sense.

VoVAllen commented 5 years ago

I think we need to return the removed edges and let the user handle the mapping between the original and modified indices (or provide a utility function).
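One way such a utility could look is sketched below in plain Python. The function `remove_nodes` and its return values are hypothetical, named here only for illustration; this is not a DGL API. It removes nodes, reports which edges were dropped as a consequence, and returns old-to-new maps for the surviving nodes and edges.

```python
# Hypothetical sketch of a removal utility that reports what changed; not DGL code.

def remove_nodes(num_nodes, edges, nodes_to_remove):
    """Remove nodes from a graph given as a list of (src, dst) edges.

    Returns:
        new_edges: surviving edges expressed in the new node IDs
        removed_edge_ids: original IDs of edges that touched a removed node
        node_map: original node ID -> new node ID
        edge_map: original edge ID -> new edge ID
    """
    drop = set(nodes_to_remove)
    kept_nodes = [n for n in range(num_nodes) if n not in drop]
    node_map = {old: new for new, old in enumerate(kept_nodes)}

    new_edges, removed_edge_ids, edge_map = [], [], {}
    for eid, (src, dst) in enumerate(edges):
        if src in drop or dst in drop:
            removed_edge_ids.append(eid)
        else:
            edge_map[eid] = len(new_edges)
            new_edges.append((node_map[src], node_map[dst]))
    return new_edges, removed_edge_ids, node_map, edge_map


edges = [(0, 1), (1, 2), (3, 4), (4, 0)]
new_edges, removed_edge_ids, node_map, edge_map = remove_nodes(5, edges, [2, 3])
print(new_edges)         # [(0, 1), (2, 0)]
print(removed_edge_ids)  # [1, 2]
print(node_map)          # {0: 0, 1: 1, 4: 2}
print(edge_map)          # {0: 0, 3: 1}
```

With the maps in hand, a caller can translate feature rows or stored IDs from the original graph to the modified one.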

yzh119 commented 5 years ago

@VoVAllen, I don't see any benefit of doing this compared to creating a subgraph of the current graph.

jermainewang commented 5 years ago

> @jermainewang What do you need for issues under NN module?

(1) EdgeSoftmax needs to be re-implemented using the new builtins; (2) GAT as an NN module. Both depend on the kernel branch to be fully merged.

@mufeili do you have time to take over these two items?
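For readers unfamiliar with the operation, edge softmax normalizes per-edge scores over the incoming edges of each destination node, which is the normalization GAT applies to its attention scores. The following is only a reference sketch of that semantics in plain Python. It says nothing about the new builtins or the kernel branch, and the function name `edge_softmax` and its signature are assumptions made for this illustration.

```python
# Reference sketch of edge-softmax semantics in plain Python; not the DGL implementation.
import math
from collections import defaultdict


def edge_softmax(edges, scores):
    """Softmax-normalize edge scores over the incoming edges of each destination.

    edges: list of (src, dst) pairs
    scores: one float per edge, in the same order as `edges`
    """
    # Per-destination maximum, subtracted before exponentiating for stability.
    max_per_dst = {}
    for (_, dst), s in zip(edges, scores):
        max_per_dst[dst] = max(s, max_per_dst.get(dst, -math.inf))

    exp_scores = [
        math.exp(s - max_per_dst[dst]) for (_, dst), s in zip(edges, scores)
    ]

    # Per-destination normalizer.
    sum_per_dst = defaultdict(float)
    for (_, dst), e in zip(edges, exp_scores):
        sum_per_dst[dst] += e

    return [e / sum_per_dst[dst] for (_, dst), e in zip(edges, exp_scores)]


edges = [(0, 2), (1, 2), (0, 1)]
scores = [1.0, 1.0, 3.0]
attn = edge_softmax(edges, scores)
print(attn)  # [0.5, 0.5, 1.0]
```

The two edges into node 2 have equal scores and split the weight evenly, while the single edge into node 1 receives all of it.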

mufeili commented 5 years ago

@jermainewang I may need to wait until the weekend to start the implementation. If that works for you, I'll take them over.

jermainewang commented 5 years ago

I've taken to_simple_graph. The rest is yours.

jermainewang commented 5 years ago

v0.3 has been released. Thanks everyone!