dmlc / dgl

Python package built to ease deep learning on graphs, on top of existing DL frameworks.
http://dgl.ai
Apache License 2.0

[Roadmap] v0.4 release tracker #666

Closed · jermainewang closed this 5 years ago

jermainewang commented 5 years ago

Tentative release date: 09/30

[Feature] Heterogeneous graph

This has been a highly demanded feature since the birth of DGL, and it is finally time to push for it. v0.4 will be largely about this support, including but not limited to the items in the tracker below.

Tracker
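To make the goal concrete, here is a minimal sketch of the kind of API this feature points toward; the `dgl.heterograph` constructor and the tensor-pair edge format shown here follow later DGL releases and are assumptions, not commitments of this tracker:

```python
import dgl
import torch

# A heterograph is defined per (source type, edge type, destination type) relation.
g = dgl.heterograph({
    ('user', 'follows', 'user'): (torch.tensor([0, 1]), torch.tensor([1, 2])),
    ('user', 'clicks', 'item'): (torch.tensor([0, 2]), torch.tensor([0, 1])),
})
print(g.ntypes, g.etypes)  # node/edge types, e.g. ['item', 'user'] ['clicks', 'follows']

# Features are stored separately per node (or edge) type.
g.nodes['user'].data['h'] = torch.rand(g.num_nodes('user'), 16)
```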

[Feature] Global pooling module (Done in v0.3.1)

Our current graph pooling (readout) support is limited, offering only basic sum/max readout operations. In v0.4, we want to enrich this part.
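For reference, the existing readout path looks roughly like this; `dgl.sum_nodes`/`dgl.max_nodes` are the readout helpers, while `dgl.graph` is the newer constructor name (v0.4-era code used `dgl.DGLGraph`):

```python
import dgl
import torch

g = dgl.graph(([0, 1, 2], [1, 2, 0]))  # a small 3-node cycle
g.ndata['h'] = torch.rand(3, 16)

hg_sum = dgl.sum_nodes(g, 'h')  # (1, 16) graph-level representation
hg_max = dgl.max_nodes(g, 'h')  # (1, 16)
```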

[Feature] Enrich NN modules (Mostly done in v0.3.1)

Tracker

@yzh119 please update.

[Feature] Unified graph data format

The idea is to define our own data storage format and provide easy utilities to convert, load, and save to/from it. (RFC #758)

Tracker:
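To illustrate the intended usage pattern, a minimal sketch; the `save_graphs`/`load_graphs` names follow what later shipped in `dgl.data.utils` and are not fixed by the RFC:

```python
import torch
import dgl
from dgl.data.utils import save_graphs, load_graphs  # names as in later releases

g = dgl.graph(([0, 1], [1, 2]))  # `dgl.DGLGraph` in v0.4-era code
g.ndata['h'] = torch.rand(3, 4)

# Serialize a list of graphs plus optional graph-level label tensors.
save_graphs('graphs.bin', [g], labels={'y': torch.tensor([1])})
glist, label_dict = load_graphs('graphs.bin')
```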

[Application] Knowledge base embedding

Tracker:

Other

Postponed to v0.5

mufeili commented 5 years ago

For the pooling module, shall we also support common clustering algorithms (KNN, spectral clustering, ...)?

jermainewang commented 5 years ago

> For the pooling module, shall we also support common clustering algorithms (KNN, spectral clustering, ...)?

I think we will focus on DL-based pooling methods in this release. For KNN and spectral clustering, I would suggest converting our graph to numpy/scipy and using sklearn, as in the sketch below. If the conversion is handled carefully (probably with zero-copy support), it should be very efficient.
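A sketch of that conversion path, building the scipy matrix from the edge list by hand (a zero-copy adjacency export would avoid even this copy); `g` is assumed to be a DGLGraph over an undirected, symmetric graph:

```python
import numpy as np
import scipy.sparse as sp
from sklearn.cluster import SpectralClustering

src, dst = g.edges()                 # edge endpoints as tensors
n = g.number_of_nodes()
adj = sp.coo_matrix((np.ones(len(src)), (src.numpy(), dst.numpy())),
                    shape=(n, n))

# Treat the adjacency matrix as a precomputed affinity for sklearn.
labels = SpectralClustering(n_clusters=4,
                            affinity='precomputed').fit_predict(adj.toarray())
```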

yzh119 commented 5 years ago

Should GraphSAGE also be included in the NN modules? Set Transformer is also a kind of graph pooling mechanism; if we have time, we could try it.
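For reference, a GraphSAGE layer call could look like the following; `SAGEConv` is shown as it later landed in `dgl.nn`, and `dgl.graph` is the newer constructor name:

```python
import dgl
import torch
from dgl.nn.pytorch import SAGEConv  # as it later shipped in dgl.nn

g = dgl.graph(([0, 1, 2], [1, 2, 0]))
feat = torch.rand(3, 10)

# Mean-aggregator GraphSAGE: combine each node's feature with the mean of its neighbors'.
conv = SAGEConv(in_feats=10, out_feats=16, aggregator_type='mean')
out = conv(g, feat)  # (3, 16)
```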

aksnzhy commented 5 years ago

The CPU-based kvstore can be released in v0.4. The GPU-direct kvstore could come in the next cycle.
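Purely as a hypothetical sketch of the push/pull pattern a kvstore enables for distributed embedding training (none of these names reflect the actual DGL kvstore API):

```python
import torch

class ToyKVStore:
    """Hypothetical in-process stand-in for a sharded parameter server."""
    def __init__(self):
        self._data = {}

    def init_data(self, name, shape):
        self._data[name] = torch.zeros(shape)

    def push(self, name, ids, grads, lr=0.1):
        # Trainers push sparse gradients; the server applies them in place.
        self._data[name][ids] -= lr * grads

    def pull(self, name, ids):
        # Trainers pull only the embedding rows needed for a minibatch.
        return self._data[name][ids]
```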

tbright17 commented 5 years ago

The self-attention graph pooling is simpler but more powerful than DiffPool: https://arxiv.org/abs/1904.08082. It could be good to include it.
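The core of the method is a learned per-node score followed by top-k selection; a minimal dense sketch (hypothetical helper, with the scores assumed to come from any GNN layer):

```python
import torch

def sagpool_topk(x, adj, scores, ratio=0.5):
    """Keep the top ratio*N nodes by attention score (SAGPool, arXiv:1904.08082).
    x: (N, F) node features, adj: (N, N) dense adjacency, scores: (N,) GNN outputs."""
    k = max(1, int(ratio * x.size(0)))
    topk = torch.topk(scores, k).indices
    # Gate the surviving features by their squashed attention scores.
    x_pooled = x[topk] * torch.tanh(scores[topk]).unsqueeze(-1)
    adj_pooled = adj[topk][:, topk]
    return x_pooled, adj_pooled
```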

HQ01 commented 5 years ago

> The self-attention graph pooling is simpler but more powerful than DiffPool: https://arxiv.org/abs/1904.08082. It could be good to include it.

Just want to mention that there is an inconsistency between DiffPool's reported experimental results and the Self-Attention Graph Pooling paper's reported DiffPool results, though.

tbright17 commented 5 years ago

> > The self-attention graph pooling is simpler but more powerful than DiffPool: https://arxiv.org/abs/1904.08082. It could be good to include it.
>
> Just want to mention that there is an inconsistency between DiffPool's reported experimental results and the Self-Attention Graph Pooling paper's reported DiffPool results, though.

Wow the gap is really big...

mufeili commented 5 years ago

Depending on our bandwidth, we may want to add examples for three important applications:

  1. Molecule Property Prediction: Molecular graphs are probably among the most important application domains for small graphs. For this area, Neural Message Passing for Quantum Chemistry could be a good example candidate. During our discussion with the Tencent Alchemy team, this model achieved the best performance among previous work on the quantum chemistry tasks they are interested in. It has also been previously mentioned in the discussion forum here. I will take this.
  2. Point Cloud: An important topic for constructing graphs over non-graph data and bridging graph computing with CV and graphics, as mentioned in issue #719 (see the sketch after this list).
  3. Geometry/3D data: The latest wave of deep learning on graphs has a strong connection with geometric data and can be collectively considered geometric deep learning. There could be high interest in applying graph neural networks to more general geometric data, as mentioned in a discussion thread before.
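On point 2, graph construction from a point cloud is typically a k-nearest-neighbor graph; a minimal sketch, with `dgl.knn_graph` as it appears in later DGL releases:

```python
import dgl
import torch

points = torch.rand(1000, 3)   # 1000 random 3-D points
g = dgl.knn_graph(points, k=8)  # edge from each point's 8 nearest neighbors
print(g.num_nodes(), g.num_edges())  # 1000 8000
```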

jermainewang commented 5 years ago

Changed the draft to a progress tracker. The target release date is 09/30.

For all committers @zheng-da @szha @BarclayII @VoVAllen @ylfdq1118 @yzh119 @GaiYu0 @mufeili @aksnzhy @zzhang-cn @ZiyueHuang , please vote with +1 if you agree with this plan.

aksnzhy commented 5 years ago

@jermainewang Actually the kvstore has been finished, and we have already finished a demo training distributed DistMult on the FB15k data. Should we release this demo in v0.4?
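For context, DistMult scores a knowledge-base triple (h, r, t) with a bilinear-diagonal form, score = Σ_i h_i r_i t_i; a minimal sketch (not the demo's actual code):

```python
import torch

def distmult_score(head, rel, tail):
    """DistMult triple score: <head, diag(rel), tail>.
    head, rel, tail: (batch, dim) embeddings of head entity, relation, tail entity."""
    return (head * rel * tail).sum(dim=-1)
```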

jermainewang commented 5 years ago

> @jermainewang Actually the kvstore has been finished, and we have already finished a demo training distributed DistMult on the FB15k data. Should we release this demo in v0.4?

Yes, let's push for the feature, but it's OK if it needs more time to polish; in that case we can highlight it in v0.5.

jermainewang commented 5 years ago

v0.4 has been released. Thanks everyone for the support.