dmlc / gluon-nlp

NLP made easy
https://nlp.gluon.ai/
Apache License 2.0
2.56k stars 538 forks source link

BeamSearchSampler failing with mx.numpy input #1220

Closed pasmargo closed 4 years ago

pasmargo commented 4 years ago

Description

I follow the instructions to generate sequences with Beam Search and it works correctly when the input is an mx.nd object. However, I get an error message when the input is an mx.numpy object.

With the original line, it works correctly:

inputs = mx.nd.full(shape=(1,), ctx=ctx, val=bos_ids[-1])

With this other line, it makes the sampler fail:

inputs = mx.numpy.full((1,), bos_ids[-1], ctx=ctx)

Error Message

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-18-711d82cb4e91> in <module>()
----> 1 generate_sequences(beam_sampler, inputs, begin_states, 5)

<ipython-input-17-c4c52c7fd9b6> in generate_sequences(sampler, inputs, begin_states, num_print_outcomes)
      1 def generate_sequences(sampler, inputs, begin_states, num_print_outcomes):
      2 
----> 3     samples, scores, valid_lengths = sampler(inputs, begin_states)
      4     samples = samples[0].asnumpy()
      5     scores = scores[0].asnumpy()

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/gluonnlp/model/sequence_sampler.py in __call__(self, inputs, states)
    525                                       state_info=state_info)
    526         step_input = _expand_to_beam_size(inputs, beam_size=beam_size,
--> 527                                           batch_size=batch_size).astype(np.int32)
    528         # All beams are initialized to alive
    529         # Generated samples are initialized to be the inputs

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/gluonnlp/model/sequence_sampler.py in _expand_to_beam_size(data, beam_size, batch_size, state_info)
    199         new_shape[batch_axis] = batch_size * beam_size
    200         new_shape = tuple(new_shape)
--> 201         return data.expand_dims(batch_axis+1)\
    202                    .broadcast_axes(axis=batch_axis+1, size=beam_size)\
    203                    .reshape(new_shape)

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/numpy/multiarray.py in expand_dims(self, *args, **kwargs)
   1435         this array as data.
   1436         """
-> 1437         raise AttributeError('mxnet.numpy.ndarray object has no attribute expand_dims')
   1438 
   1439     def tile(self, *args, **kwargs):

AttributeError: mxnet.numpy.ndarray object has no attribute expand_dims

To Reproduce

Use the code for the sequence generation using Beam Search in this link:

https://gluon-nlp.mxnet.io/examples/sequence_sampling/sequence_sampling.html

And substitute the line

inputs = mx.nd.full(shape=(1,), ctx=ctx, val=bos_ids[-1])

with

inputs = mx.numpy.full((1,), bos_ids[-1], ctx=ctx)

What have you tried to solve it?

  1. I tried to convert the mx.numpy input to mx.nd with .as_nd_ndarray(). It gets passed that error, but then Gluon Blocks requires all outputs to be mx.numpy and it fails in an old reshape method (mx.numpy should not have a named shape argument):
~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/gluonnlp/model/sequence_sampler.py in hybrid_forward(self, F, samples, valid_length, outputs, scores, beam_alive_mask, states)
    418         beam_size = self._beam_size
    419         # outputs: (batch_size, beam_size, vocab_size)
--> 420         outputs = outputs.reshape(shape=(-4, -1, beam_size, 0))
    421         if self._top_k:
    422             ranks = outputs.argsort(is_ascend=False, dtype='int32')

Environment

We recommend using our script for collecting the diagnositc information. Run the following command and paste the outputs below:

curl --retry 10 -s https://raw.githubusercontent.com/dmlc/gluon-nlp/master/tools/diagnose.py | python

paste outputs here

----------Python Info----------
Version      : 3.6.5
Compiler     : GCC 7.2.0
Build        : ('default', 'Apr 29 2018 16:14:56')
Arch         : ('64bit', '')
------------Pip Info-----------
Version      : 10.0.1
Directory    : /home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/pip
----------MXNet Info-----------
Version      : 1.6.0
Directory    : /home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet
Num GPUs     : 1
Commit Hash   : 6eec9da55c5096079355d1f1a5fa58dcf35d6752
----------System Info----------
Platform     : Linux-4.14.171-105.231.amzn1.x86_64-x86_64-with-glibc2.9
system       : Linux
node         : ip-172-16-83-169
release      : 4.14.171-105.231.amzn1.x86_64
version      : #1 SMP Thu Feb 27 23:49:15 UTC 2020
----------Hardware Info----------
machine      : x86_64
processor    : x86_64
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
Stepping:              1
CPU MHz:               2701.008
CPU max MHz:           3000.0000
CPU min MHz:           1200.0000
BogoMIPS:              4600.14
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              46080K
NUMA node0 CPU(s):     0-3
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx xsaveopt
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0022 sec, LOAD: 0.4192 sec.
Timing for GluonNLP GitHub: https://github.com/dmlc/gluon-nlp, DNS: 0.0005 sec, LOAD: 0.3731 sec.
Timing for GluonNLP: http://gluon-nlp.mxnet.io, DNS: 0.0927 sec, LOAD: 0.0981 sec.
Timing for D2L: http://d2l.ai, DNS: 0.0303 sec, LOAD: 0.0695 sec.
Timing for D2L (zh-cn): http://zh.d2l.ai, DNS: 0.0307 sec, LOAD: 0.1809 sec.
Timing for FashionMNIST: https://repo.mxnet.io/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0259 sec, LOAD: 0.3750 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0034 sec, LOAD: 0.0990 sec.
Error open Conda: https://repo.continuum.io/pkgs/free/, HTTP Error 403: Forbidden, DNS finished in 0.003132343292236328 sec.
leezu commented 4 years ago

We'll have a new release of gluonnlp to support the numpy interface. For now just convert the array to nd before calling the beamsearchsample

cc @sxjscience

sxjscience commented 4 years ago

For the numpy version, you may try to use the version that is initiated in https://github.com/dmlc/gluon-nlp/pull/1225