facebookresearch / beanmachine

A library that allows for inference on probabilistic models
https://beanmachine.org/
MIT License

Segmentation fault in Beanmachine-vectorized implementation of Robust Regression #63

SourabhKul closed 4 years ago

SourabhKul commented 4 years ago

Issue Description

When running PPLBench, the Beanmachine-vectorized implementation of Robust Regression causes a segmentation fault. This occurs on Amazon EC2 instances.

Steps to Reproduce

Linux ip-172-31-10-135 4.15.0-1056-aws #58-Ubuntu SMP Tue Nov 26 15:14:34 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

pip3 list
alabaster (0.7.12)
appdirs (1.4.3)
arviz (0.5.1)
asn1crypto (0.24.0)
attrs (19.3.0)
Automat (0.6.0)
Babel (2.7.0)
BeanMachine (0.0.1a1, /home/ubuntu/BeanMachine)
black (19.3b0)
blinker (1.4)
certifi (2019.11.28)
cftime (1.0.4.2)
chardet (3.0.4)
Click (7.0)
cloud-init (19.3)
colorama (0.3.7)
command-not-found (0.3)
configobj (5.0.6)
constantly (15.1.0)
coverage (5.0)
cryptography (2.1.4)
cycler (0.10.0)
Cython (0.29.14)
dataclasses (0.7)
distro-info (0.18ubuntu0.18.04.1)
docutils (0.15.2)
ec2-hibinit-agent (1.0.0)
entrypoints (0.3)
flake8 (3.7.9)
h5py (2.10.0)
hibagent (1.0.1)
httplib2 (0.9.2)
hyperlink (17.3.1)
idna (2.8)
imagesize (1.1.0)
importlib-metadata (1.3.0)
incremental (16.10.1)
isort (4.3.21)
Jinja2 (2.10.3)
jsonpatch (1.16)
jsonpointer (1.10)
jsonschema (2.6.0)
keyring (10.6.0)
keyrings.alt (3.0)
kiwisolver (1.1.0)
language-selector (0.1)
MarkupSafe (1.1.1)
matplotlib (3.1.2)
mccabe (0.6.1)
more-itertools (8.0.2)
netCDF4 (1.5.3)
netifaces (0.10.4)
numpy (1.17.4)
oauthlib (2.0.6)
packaging (19.2)
PAM (0.4.2)
pandas (0.25.3)
patsy (0.5.1)
pip (9.0.1)
plotly (4.4.1)
pluggy (0.13.1)
py (1.8.0)
pyasn1 (0.4.2)
pyasn1-modules (0.2.1)
pycodestyle (2.5.0)
pycrypto (2.6.1)
pyflakes (2.1.1)
Pygments (2.5.2)
pygobject (3.26.1)
pyjags (1.2.2)
PyJWT (1.5.3)
pymc3 (3.8)
pyOpenSSL (17.5.0)
pyparsing (2.4.5)
pyserial (3.4)
pystan (2.19.1.1)
pytest (5.3.2)
pytest-cov (2.8.1)
python-apt (1.6.4)
python-dateutil (2.8.1)
python-debian (0.1.32)
pytz (2019.3)
pyxdg (0.25)
PyYAML (3.12)
requests (2.22.0)
requests-unixsocket (0.1.5)
retrying (1.3.3)
scipy (1.4.0)
SecretStorage (2.3.1)
service-identity (16.0.0)
setuptools (42.0.2)
six (1.13.0)
snowballstemmer (2.0.0)
Sphinx (2.3.0)
sphinx-autodoc-typehints (1.10.3)
sphinxcontrib-applehelp (1.0.1)
sphinxcontrib-devhelp (1.0.1)
sphinxcontrib-htmlhelp (1.0.2)
sphinxcontrib-jsmath (1.0.1)
sphinxcontrib-qthelp (1.0.2)
sphinxcontrib-serializinghtml (1.1.3)
ssh-import-id (5.7)
statsmodels (0.10.2)
systemd-python (234)
Theano (1.0.4)
toml (0.10.0)
torch (1.2.0)
tqdm (4.40.2)
Twisted (17.9.0)
ufw (0.36)
unattended-upgrades (0.1)
urllib3 (1.25.7)
wcwidth (0.1.7)
wheel (0.30.0)
xarray (0.14.1)
zipp (0.6.0)
zope.interface (4.3.2)

Expected Behavior

What did you expect to happen?

Run this command from the pplbench directory: python3 PPLBench.py -m robustRegression -l beanmachine-vectorized -s 500 --trials 1

Observed outcome:

Generating data
Starting benchmark; estimated time is 0 hour(s),3 minutes
Outputs will be saved in : ./outputs/19-12-2019_19:21:56
Segmentation fault (core dumped)
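
One way to narrow down where a crash like this originates is Python's standard `faulthandler` module, which prints the Python-level traceback when the interpreter receives a fatal signal such as SIGSEGV. The following is a minimal sketch; placing it near the top of the entry script is an assumption for illustration and not something done in this report.

```python
import faulthandler
import sys

# Ask the interpreter to dump the Python traceback of every thread to
# stderr if the process receives a fatal signal (SIGSEGV, SIGFPE, ...).
# Assumption: this runs before the benchmark code that crashes.
faulthandler.enable(file=sys.stderr, all_threads=True)
```

The same handler can be enabled without editing any file by running `python3 -X faulthandler PPLBench.py -m robustRegression -l beanmachine-vectorized -s 500 --trials 1`.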

System Info

Please provide information about your setup

Additional Context

For the following system (local server) this segfault does not happen.

Linux nanofabrics-server 5.0.0-37-generic #40~18.04.1-Ubuntu SMP Thu Nov 14 12:06:39 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

pip3 list
alabaster (0.7.12)
appdirs (1.4.3)
apturl (0.5.2)
arviz (0.5.1)
asn1crypto (0.24.0)
astroid (2.3.3)
attrs (19.3.0)
Babel (2.7.0)
BeanMachine (0.0.1a1, /home/sourabh/BeanMachine)
black (19.3b0)
Brlapi (0.6.6)
certifi (2018.1.18)
cftime (1.0.4.2)
chardet (3.0.4)
Click (7.0)
command-not-found (0.3)
coverage (5.0)
cryptography (2.1.4)
cupshelpers (1.0)
cycler (0.10.0)
Cython (0.29.14)
dataclasses (0.7)
defer (1.0.6)
distro-info (0.18ubuntu0.18.04.1)
docutils (0.15.2)
flake8 (3.5.0)
h5py (2.10.0)
httplib2 (0.9.2)
idna (2.6)
imagesize (1.1.0)
importlib-metadata (1.3.0)
isort (4.3.21)
Jinja2 (2.10.3)
keyring (10.6.0)
keyrings.alt (3.0)
kiwisolver (1.1.0)
language-selector (0.1)
launchpadlib (1.10.6)
lazr.restfulclient (0.13.5)
lazr.uri (1.0.3)
lazy-object-proxy (1.4.3)
louis (3.5.0)
macaroonbakery (1.1.3)
Mako (1.0.7)
MarkupSafe (1.0)
matplotlib (3.1.2)
mccabe (0.6.1)
more-itertools (8.0.2)
netCDF4 (1.5.3)
netifaces (0.10.4)
numpy (1.17.4)
oauth (1.0.1)
olefile (0.45.1)
opt-einsum (3.1.0)
packaging (19.2)
pandas (0.25.3)
patsy (0.5.1)
pexpect (4.2.1)
Pillow (5.1.0)
pip (9.0.1)
pluggy (0.13.1)
protobuf (3.0.0)
py (1.8.0)
pycairo (1.16.2)
pycodestyle (2.3.1)
pycrypto (2.6.1)
pycups (1.9.73)
pyflakes (1.6.0)
Pygments (2.5.2)
pygobject (3.26.1)
pyjags (1.2.2)
pylint (2.4.4)
pymacaroons (0.13.0)
pymc3 (3.8)
PyNaCl (1.1.2)
pyparsing (2.4.5)
pyRFC3339 (1.0)
pyro-api (0.1.1)
pyro-ppl (1.1.0)
pystan (2.19.1.1)
pytest (5.3.2)
pytest-cov (2.8.1)
python-apt (1.6.4)
python-dateutil (2.8.1)
python-debian (0.1.32)
pytz (2019.3)
pyxdg (0.25)
PyYAML (3.12)
reportlab (3.4.0)
requests (2.18.4)
requests-unixsocket (0.1.5)
scipy (1.3.3)
SecretStorage (2.3.1)
setuptools (42.0.2)
simplejson (3.13.2)
six (1.13.0)
snowballstemmer (2.0.0)
Sphinx (2.3.0)
sphinx-autodoc-typehints (1.10.3)
sphinxcontrib-applehelp (1.0.1)
sphinxcontrib-devhelp (1.0.1)
sphinxcontrib-htmlhelp (1.0.2)
sphinxcontrib-jsmath (1.0.1)
sphinxcontrib-qthelp (1.0.2)
sphinxcontrib-serializinghtml (1.1.3)
ssh-import-id (5.7)
system-service (0.3)
systemd-python (234)
Theano (1.0.4)
toml (0.10.0)
torch (1.3.1)
tqdm (4.40.2)
typed-ast (1.4.0)
ubuntu-drivers-common (0.0.0)
ufw (0.36)
unattended-upgrades (0.1)
urllib3 (1.22)
usb-creator (0.3.3)
wadllib (1.3.2)
wcwidth (0.1.7)
wheel (0.30.0)
wrapt (1.11.2)
xarray (0.14.1)
xkit (0.0.0)
zipp (0.6.0)
zope.interface (4.3.2)
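
Comparing the two `pip3 list` dumps by eye is error-prone, so a small script can surface the differences. This is a rough sketch that assumes the legacy `name (version)` format shown above; the function names `parse_pip_list` and `diff_environments` are made up for illustration, and the sample strings contain only three of the packages listed in this report.

```python
import re

# Matches the legacy `pip list` format, e.g. "torch (1.2.0)" or
# "BeanMachine (0.0.1a1, /home/ubuntu/BeanMachine)".
_LINE = re.compile(r"^(\S+) \(([^,)]+)")


def parse_pip_list(text):
    """Map lower-cased package name -> version string."""
    versions = {}
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            versions[match.group(1).lower()] = match.group(2)
    return versions


def diff_environments(a, b):
    """Return {name: (version_in_a, version_in_b)} for every mismatch."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Three entries copied from the two listings in this report.
ec2 = parse_pip_list("numpy (1.17.4)\nscipy (1.4.0)\ntorch (1.2.0)")
local = parse_pip_list("numpy (1.17.4)\nscipy (1.3.3)\ntorch (1.3.1)")
print(diff_environments(ec2, local))
```

Packages present in only one environment show up with `None` on the other side, which makes missing installs as visible as version mismatches.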

SourabhKul commented 4 years ago

The issue was caused by not using torch==1.2.0+cpu. Closing now.
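
A quick way to confirm which torch build an environment has is to read the installed version string and look for the `+cpu` local version label named in the comment above. This is only a sketch: the helper names are hypothetical, and it assumes Python 3.8 or newer for `importlib.metadata`, which may be newer than the interpreter used in this report.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_cpu_build(version):
    """True if the version string carries the '+cpu' local version label."""
    return version is not None and version.endswith("+cpu")


version = installed_version("torch")
if version is None:
    print("torch is not installed")
elif is_cpu_build(version):
    print(f"torch {version}: CPU-only build")
else:
    print(f"torch {version}: no '+cpu' label; build may differ from the CPU-only wheel")
```

Running this in both environments before the benchmark makes a mismatched torch build visible without waiting for a crash.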