apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0
20.76k stars 6.8k forks

armhf: virtual memory exhausted: Cannot allocate memory #18869

Open amir-saniyan opened 4 years ago

amir-saniyan commented 4 years ago

Description

Compiling MXNet 1.6.0 on Debian 10.5.0 (armhf), for use on a Raspberry Pi, fails.

Error Message

When the build system compiles np_einsum_op.cc, the following error occurs:

virtual memory exhausted: Cannot allocate memory
ninja: build stopped: subcommand failed.

To Reproduce

$ tar -xvf apache-mxnet-src-1.6.0-incubating.tar.gz
$ cd apache-mxnet-src-1.6.0-incubating
$ mkdir build
$ cd build
$ cmake \
    -DSUPPORT_F16C=OFF \
    -DUSE_SSE=OFF \
    -DUSE_CUDA=OFF \
    -DUSE_OPENCV=OFF \
    -DUSE_OPENMP=ON \
    -DUSE_MKL_IF_AVAILABLE=OFF \
    -DUSE_SIGNAL_HANDLER=ON \
    -DCMAKE_BUILD_TYPE=Release \
    -G Ninja ..
$ ninja

What have you tried to solve it?

The machine is not running out of physical memory: it has 4 GB of RAM and 2 GB of swap.

I am using a 32-bit GCC (Debian, armhf). A 32-bit process has only 3 GiB of userspace address space available, and the error occurs as soon as a single compiler process needs more than that.
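
As a rough check of that limit, here is a minimal sketch. The helper name `addr_space_gib` is hypothetical and only used for illustration, and the 3 GiB figure assumes the common 3G/1G user/kernel split on 32-bit Linux. `getconf LONG_BIT` reports the word size of the running userland.

```shell
#!/bin/sh
# Size, in GiB, of an address space with the given pointer width.
# addr_space_gib is a hypothetical helper name used only for illustration.
addr_space_gib() {
    bits=$1
    # 2^bits bytes divided by 2^30 bytes per GiB = 2^(bits - 30)
    echo $(( 1 << (bits - 30) ))
}

# Word size of the running userland: 32 on armhf, 64 on arm64/amd64.
# Falls back to "unknown" if getconf is not installed.
word_bits=$(getconf LONG_BIT 2>/dev/null || echo unknown)
echo "userland is ${word_bits}-bit"

# A 32-bit process can address 4 GiB in total; with the usual 3G/1G split,
# only about 3 GiB of that is usable by the compiler process itself.
echo "32-bit address space: $(addr_space_gib 32) GiB"
```

If the userland is 32-bit, adding RAM or swap does not raise this per-process ceiling.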

I edited build/build.ninja and changed -O3 to -O1 in the FLAGS for np_einsum_op.cc. After that change, the compilation succeeded.
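
This manual edit could be scripted roughly as follows. It is a minimal sketch, assuming the CMake-generated build.ninja writes each build statement as a `build ...:` line followed by indented variables and ends it with a blank line. `lower_opt_level` is a hypothetical helper, not part of MXNet or Ninja.

```shell
#!/bin/sh
# lower_opt_level FILE OBJECT_PATTERN
# Rewrites -O3 to -O1 only inside the build statement whose output matches
# OBJECT_PATTERN (a basic regular expression); other targets keep their flags.
# Hypothetical helper for illustration, not part of MXNet or Ninja.
lower_opt_level() {
    ninja_file=$1
    object_pattern=$2
    # The sed range starts at the matching "build ...:" line and ends at the
    # next blank line, which is assumed to terminate the build statement.
    sed "/^build .*${object_pattern}:/,/^\$/ s/-O3/-O1/" "$ninja_file" > "$ninja_file.tmp" &&
        mv "$ninja_file.tmp" "$ninja_file"
}

# Usage, from the build directory, after running cmake and before ninja:
#   lower_opt_level build.ninja 'np_einsum_op\.cc\.o'
```

Re-running CMake regenerates build.ninja and discards the change, so the helper has to be applied again after every reconfigure.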

Please split large source files (such as np_einsum_op.cc) into smaller ones to prevent this issue. A 32-bit compiler on Linux cannot use more than 3 GB of memory, no matter how much total memory the machine has.

amir-saniyan commented 4 years ago

The following paragraph is not always true:

https://mxnet.apache.org/versions/1.6/get_started?platform=devices&iot=raspberry-pi:

If you are getting build errors in which the compiler is being killed, it is likely that the compiler is running out of memory (especially if you are on Raspberry Pi 1, 2 or Zero, which have less than 1GB of RAM). This can often be rectified by increasing the swapfile size on the Pi by editing the file /etc/dphys-swapfile and changing the line CONF_SWAPSIZE=100 to CONF_SWAPSIZE=1024, then running:

sudo /etc/init.d/dphys-swapfile stop
sudo /etc/init.d/dphys-swapfile start
free -m # to verify the swapfile size has been increased

A compiler on 32-bit Linux cannot use more than 3 GB of memory, no matter how much total memory the machine has, so a larger swap file does not help in that case.

leezu commented 4 years ago

@amir-saniyan

Please split large source files (such as np_einsum_op.cc) into smaller ones to prevent this issue. A 32-bit compiler on Linux cannot use more than 3 GB of memory, no matter how much total memory the machine has.

This would be helpful indeed. Would you be able to submit a PR?

marcoabreu commented 4 years ago

Generally, we recommend cross-compiling rather than compiling directly on a Raspberry Pi: https://mxnet.apache.org/versions/1.6/get_started?platform=devices&iot=raspberry-pi&