apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Multinomial distribution returns same results during different runs #14183

Open Ishitori opened 5 years ago

Ishitori commented 5 years ago

Description

If I sample from a multinomial distribution by running the same script multiple times without setting the seed, I always get the same result. If I do the equivalent with NumPy, I get different results on every run.

Created based on this question.

Environment info (Required)

----------Python Info----------
Version      : 3.6.4
Compiler     : GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)
Build        : ('default', 'Jan 16 2018 12:04:33')
Arch         : ('64bit', '')
------------Pip Info-----------
Version      : 19.0.2
Directory    : /Users/sssokolo/anaconda3/lib/python3.6/site-packages/pip
----------MXNet Info-----------
/Users/sssokolo/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Version      : 1.5.0
Directory    : /Users/sssokolo/anaconda3/lib/python3.6/site-packages/mxnet
Commit Hash   : fd34dc5f847192dfd522555afdf13be1eb67b72b
----------System Info----------
Platform     : Darwin-16.7.0-x86_64-i386-64bit
system       : Darwin
node         : 8c859074eea0
release      : 16.7.0
version      : Darwin Kernel Version 16.7.0: Thu Dec 20 21:53:35 PST 2018; root:xnu-3789.73.31~1/RELEASE_X86_64
----------Hardware Info----------
machine      : x86_64
processor    : i386
b'machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT PREFETCHW RDTSCP TSCI'
b'machdep.cpu.leaf7_features: SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 BMI2 INVPCID RTM SMAP RDSEED ADX IPT SGX FPU_CSDS MPX CLFSOPT'
b'machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C'
b'machdep.cpu.brand_string: Intel(R) Core(TM) i7-7660U CPU @ 2.50GHz'
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0821 sec, LOAD: 0.9414 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.1146 sec, LOAD: 0.9061 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.1225 sec, LOAD: 0.7862 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0840 sec, LOAD: 0.5627 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0746 sec, LOAD: 1.6146 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0777 sec, LOAD: 0.3902 sec.

Package used (Python/R/Scala/Julia): Python

MXNet commit hash: nightly build

Minimum reproducible example

from mxnet import nd

data = nd.array([0.5, 0.5])

for k in range(3):
    a = nd.random.multinomial(data, shape=(5, 1))
    print(a)

Steps to reproduce

  1. Run the code above once and note the output
  2. Run it a few more times; the results are exactly the same as before
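The two steps above can be automated with a small script. This is only a sketch: `run_snippet` and `identical_across_runs` are hypothetical helper names, not part of MXNet. Each run starts a fresh Python interpreter, because the report is about separate runs of the script, not repeated calls within one session.

```python
import subprocess
import sys


def run_snippet(code: str) -> str:
    """Run `code` in a fresh Python interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # stderr (e.g. import warnings) is kept separate
        text=True,
        check=True,
    )
    return result.stdout


def identical_across_runs(code: str, runs: int = 3) -> bool:
    """Return True if every fresh run of `code` prints exactly the same text."""
    outputs = {run_snippet(code) for _ in range(runs)}
    return len(outputs) == 1


# The minimal reproducible example from this report, as a string.
# Running it requires MXNet to be installed.
MXNET_SNIPPET = """
from mxnet import nd

data = nd.array([0.5, 0.5])
for k in range(3):
    print(nd.random.multinomial(data, shape=(5, 1)))
"""
```

On a machine with MXNet installed, `identical_across_runs(MXNET_SNIPPET)` would be expected to return `True` if the behavior described here is present.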

What have you tried to solve it?

To check whether this is really a problem, I wrote an equivalent NumPy example. Running the example below gives different results on every run.

import numpy as np
print(np.random.multinomial(1, [0.5, 0.5], size=5))
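A possible workaround, assuming the repeated output comes from MXNet starting every process from the same default seed (this thread does not confirm the cause), is to seed MXNet explicitly from OS entropy at startup. The helper names `fresh_seed` and `sample_with_fresh_seed` are made up for illustration; `mx.random.seed` is MXNet's seeding function.

```python
import os


def fresh_seed() -> int:
    """Return a seed in [0, 2**31) drawn from OS entropy (hypothetical helper)."""
    return int.from_bytes(os.urandom(4), "little") & 0x7FFFFFFF


def sample_with_fresh_seed():
    """Seed MXNet's random generators, then draw the samples from the report."""
    # Imported lazily so that fresh_seed() is usable without MXNet installed.
    import mxnet as mx
    from mxnet import nd

    mx.random.seed(fresh_seed())  # assumption: a new seed per process is enough
    data = nd.array([0.5, 0.5])
    return nd.random.multinomial(data, shape=(5, 1))
```

Calling `sample_with_fresh_seed()` once per run should then give results that vary between runs, at the cost of losing reproducibility unless the seed is logged.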

mxnet-label-bot commented 5 years ago

Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Bug

ChaiBapchya commented 5 years ago

I tried it 6 times and it doesn't look that bad, but it certainly needs a deeper look, I guess.

>>> from mxnet import nd
/Users/chaitanyabapat/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
>>> 
>>> data = nd.array([0.5, 0.5])
>>> 
>>> for k in range(3):
...     a = nd.random.multinomial(data, shape=(5, 1))
...     print(a)
... 

[[1]
 [1]
 [1]
 [1]
 [1]]
<NDArray 5x1 @cpu(0)>

[[1]
 [1]
 [1]
 [0]
 [1]]
<NDArray 5x1 @cpu(0)>

[[1]
 [0]
 [0]
 [0]
 [1]]
<NDArray 5x1 @cpu(0)>
>>> for k in range(3):
...     a = nd.random.multinomial(data, shape=(5, 1))
...     print(a)
... 

[[0]
 [1]
 [0]
 [0]
 [0]]
<NDArray 5x1 @cpu(0)>

[[1]
 [1]
 [1]
 [0]
 [1]]
<NDArray 5x1 @cpu(0)>

[[0]
 [1]
 [1]
 [0]
 [0]]
<NDArray 5x1 @cpu(0)>

szha commented 5 years ago

Related: https://github.com/apache/incubator-mxnet/issues/10369#issue-310541078