[Feature Request] Add missing functions in numpy backend.

zengxy commented 5 years ago

🚀 Feature

There are some missing or incompatible functions compared with the interface in numpy backend. The list is following.

Missing

[ ] device_type
[ ] mean
[ ] unsorted_1d_segment_sum
[ ] unsorted_1d_segment_mean
[ ] zerocopy_to_dgl_ndarray
[ ] logical_not
[ ] zerocopy_to_dlpack
[ ] equal
[ ] ndim
[ ] sync
[ ] zerocopy_from_dlpack
[ ] copy_reduce
[ ] zerocopy_from_dgl_ndarray
[ ] boolean_mask
[ ] narrow_row
[ ] stack
[ ] binary_reduce
[ ] device_id
[ ] zeros_like

Not compatible

[ ] zeros (shape, dtype) v.s. (shape, dtype, ctx)
[ ] ones (shape, dtype)v.s. (shape, dtype, ctx)
[ ] full_1d (length, fill_value) v.s. (length, fill_value, dtype, ctx)

Motivation

In some scenarios, we just want to use message propogation in dgl (e.g. LPA, PageRank) and thus choose numpy as backend. However due to the missing or incompatible functions in numpy backend, even the simplest example send(message_func=dgl.function.copy_src("f", "m")) will raise an exception.

Do you think it's necessary to completing the missing functions? And if so I can help to add some within my ability.

yzh119 commented 5 years ago

It seems even in the scenarios you mentioned, uses can still use PyTorch as backend, except for the case where some operations in numpy are not supported by PyTorch. @jermainewang what do you think about it?

zheng-da commented 5 years ago

We have plans to add the MXNet Numpy backend, which provides compatible Numpy APIs with multi-core and GPU supports as well as backward computation support. Once MXNet Numpy is ready, we'll add it to DGL. Please see the progress of MXNet Numpy: https://github.com/apache/incubator-mxnet/issues/14327

In addition, a few more operators are required by DGL.

[ ] unique
[ ] nonzero
[ ] boolean_mask
[ ] index_copy
[ ] segment_sum

zengxy commented 5 years ago

Thanks for reply !

@yzh119 It's actually true that we can use the DL backend to finish it. However lots of original data (features or edges) is stored or accessed as numpy format. And one should still convert it to DL backend Tensor (e.g f = torch.Tensor(numpy_f)) when just message passing is needed. It's a little redundant for me.

@zheng-da Do you mean numpy will be replaced by someting like mxnet.numpy and one can still use it directly by writing numpy API while the backned is actually mxnet?

jermainewang commented 4 years ago

Due to the lack of bandwidth, we are not likely to push this feature in the near future. I've labeled this with help wanted and any community help is highly welcomed!

dmlc / dgl