apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Documentation bug: `dgl_csr_neighbor_non_uniform_sample` doesn't have parameter `seed_arrays` #18989

Open DNXie opened 4 years ago

DNXie commented 4 years ago

Description

The documentation for the 8 APIs listed below includes a parameter seed_arrays in the Parameters section. However, it is not in the signature and the functions do not accept it. Passing it raises an uncaught dmlc::ParamError, which aborts and crashes the program.

Error Message

terminate called after throwing an instance of 'dmlc::ParamError'
  what():  Cannot find argument 'seed_arrays', Possible Arguments:
----------------
num_args : int, required
    Number of input NDArray.
num_hops : long, optional, default=1
    Number of hops.
num_neighbor : long, optional, default=2
    Number of neighbor.
max_num_vertices : long, optional, default=100
    Max number of vertices.
, in operator _contrib_dgl_csr_neighbor_non_uniform_sample(name="", max_num_vertices="5", num_hops="1", num_args="3", num_neighbor="2", seed_arrays="
[0 1 2 3 4]
<NDArray 5 @cpu(0)>", probability="
[0.9 0.8 0.2 0.4 0.1]
<NDArray 5 @cpu(0)>", csr_matrix="
<CSRNDArray 5x5 @cpu(0)>")
Aborted (core dumped)
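
Because the failure above terminates the whole process rather than raising a Python exception, one possible workaround is to check keyword names in Python before calling the operator. The sketch below is only an illustration: the accepted set is copied from the "Possible Arguments" list in the error message and is assumed to be complete, and `unknown_params` is a hypothetical helper, not part of MXNet. The error names only seed_arrays; that csr_matrix and probability would also be rejected as keywords is an assumption of this sketch.

```python
# Keyword parameters copied from the "Possible Arguments" list in the error
# message above (assumed to be the complete set for this operator).
ACCEPTED_PARAMS = {"num_args", "num_hops", "num_neighbor", "max_num_vertices"}


def unknown_params(kwargs):
    """Return, in sorted order, the keyword names not in ACCEPTED_PARAMS.

    Hypothetical helper for illustration only; it is not an MXNet API.
    """
    return sorted(set(kwargs) - ACCEPTED_PARAMS)


# Keyword names used by the failing call in this report (values are omitted
# because only the names matter for the check).
failing_call = {
    "csr_matrix": None,
    "probability": None,
    "seed_arrays": None,
    "num_args": 3,
    "num_hops": 1,
    "num_neighbor": 2,
    "max_num_vertices": 5,
}

rejected = unknown_params(failing_call)
if rejected:
    print("Would not pass these as keywords:", rejected)
```

A check like this turns the hard abort into something the caller can handle, at the cost of keeping the accepted set in sync with the operator by hand.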

To Reproduce

This is the exact example provided in the documentation.

import mxnet as mx
import numpy as np
shape = (5, 5)
prob = mx.nd.array([0.9, 0.8, 0.2, 0.4, 0.1], dtype=np.float32)
data_np = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20], dtype=np.int64)
indices_np = np.array([1,2,3,4,0,2,3,4,0,1,3,4,0,1,2,4,0,1,2,3], dtype=np.int64)
indptr_np = np.array([0,4,8,12,16,20], dtype=np.int64)
a = mx.nd.sparse.csr_matrix((data_np, indices_np, indptr_np), shape=shape)
seed = mx.nd.array([0,1,2,3,4], dtype=np.int64)
out = mx.nd.contrib.dgl_csr_neighbor_non_uniform_sample(csr_matrix=a, probability=prob, seed_arrays=seed, num_args=3, num_hops=1, num_neighbor=2, max_num_vertices=5)

The same error occurs when calling the other 7 APIs.

Environment

OS: Ubuntu 18.04, Python: 3.7.6, pip: 20.0.2, numpy: 1.18.5, mxnet: 1.6.0

szha commented 4 years ago

cc @lingfanyu

jermainewang commented 4 years ago

cc'ed @zheng-da here. All these APIs should have been supported in DGL. We could have a discussion on whether to continue supporting them in MXNet or deprecate them.

szha commented 4 years ago

@jermainewang It's OK to deprecate them in 2.0. For 1.x, since they are already in the code base, we will need to fix the doc.

hadevin commented 3 years ago

I found this issue on ovio.org and would love to contribute! I've never contributed before.

szha commented 3 years ago

@hadevin thanks for offering to help. The contribution guide for our project can be found at: https://mxnet.apache.org/versions/master/community#contribution-guides

For this issue, I think we need to add a note to the documentation that the expected number of arrays for seed_arrays is num_args minus two (the CSR matrix and the probability array).

The operators with documentation problems are in https://github.com/apache/incubator-mxnet/blob/v1.x/src/operator/contrib/dgl_graph.cc. For example: https://github.com/apache/incubator-mxnet/blob/787416b23a1d5730fce995a6b662a51cee45e20f/src/operator/contrib/dgl_graph.cc#L866-L935

Feel free to ask here if you have any questions.
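
The relationship between num_args and seed_arrays described above (num_args equals two plus the number of seed arrays) can be sketched in plain Python. `build_sampler_call` is a hypothetical helper written for this illustration and is not part of MXNet; it assumes the operator takes the CSR matrix, the probability array, and the seed arrays as positional inputs, which this thread does not confirm.

```python
def build_sampler_call(csr_matrix, probability, seed_arrays, **params):
    """Assemble positional inputs and keyword params for the sampler.

    Hypothetical helper: derives num_args as 2 + len(seed_arrays), counting
    the CSR matrix and the probability array as the two fixed inputs.
    """
    seed_arrays = list(seed_arrays)
    if not seed_arrays:
        raise ValueError("at least one seed array is required")

    inputs = [csr_matrix, probability, *seed_arrays]
    expected = len(inputs)  # 2 fixed inputs + number of seed arrays

    params = dict(params)
    given = params.setdefault("num_args", expected)
    if given != expected:
        raise ValueError(
            f"num_args={given}, but 2 + {len(seed_arrays)} seed array(s) = {expected}"
        )
    return inputs, params


# Placeholder strings stand in for the NDArray objects from the report.
inputs, params = build_sampler_call(
    "csr", "prob", ["seed"], num_hops=1, num_neighbor=2, max_num_vertices=5
)
print(len(inputs), params["num_args"])

# Assumed usage with MXNet, if the operator accepts positional inputs:
#   mx.nd.contrib.dgl_csr_neighbor_non_uniform_sample(*inputs, **params)
```

Deriving num_args from the inputs, rather than asking the caller to supply it, removes one way for the count and the arrays to disagree.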