Closed alopez1327 closed 4 years ago
I tried to fix the error by adding apt update && apt-key update && apt install -y --no-install-recommends to deb_ubuntu_ccache.sh before autoconf is installed. That sort of worked, but now it seems that asciidoc was not installed, because:
/usr/bin/gcc -std=gnu99 -g -O2 -Wall -W -Werror -o ccache src/main.o src/args.o src/ccache.o src/cleanup.o src/compopt.o src/conf.o src/counters.o src/execute.o src/exitfn.o src/hash.o src/hashutil.o src/language.o src/lockfile.o src/manifest.o src/mdfour.o src/stats.o src/unify.o src/util.o src/version.o src/getopt_long.o src/hashtable.o src/hashtable_itr.o src/murmurhashneutral2.o src/snprintf.o -lm -lz
I believe ci/build.py is not supported outside of the CI environment, so I'm closing the issue. Feel free to reopen if you believe it is or should be supported.
Do you have any issues with cross-compile?
It's just that some of the discussions in the forum suggested using this CI capability to build a library for the Raspberry Pi. Cross-compilation will take me a bit longer, and since the documentation is currently missing from the master branch, I was not sure which was the sanctioned way of generating the library for the Pi.
I did try to use the precompiled version (mxnet-1.5.0-py2.py3-none-any.whl) of the library but it does not support OpenCV.
I'll try cross-compilation then.
Thanks!
@alopez1327 can you link the relevant discussion? Maybe @marcoabreu can clarify how to use the Docker container build setup
We used to show build.py as preferred build method on the devices-installation-dialog, but now it seems like it's gone.
Generally, the script should also work locally since it's dockerized. Especially considering that we're actively running it in CI, the error seems worth investigating. One possibility is that we're not running into it only because of our caching, and once our cache breaks, we'll face the same error. I'd therefore recommend looking into this error, since otherwise it could knock out CI if the cache gets invalidated.
@zachgk assign @larroy
@larroy Please comment if you want to take this issue.
I'm quite short on time, but I can try to chime in.
I built the latest from master without any problems. Did you update the submodules?
time ci/build.py -p armv7
...
2019-11-21 22:04:06,620 - root - INFO - Waiting for status of container 8ff0cb9f431b for 600 s.
2019-11-21 22:04:07,090 - root - INFO - Container exit status: {'Error': None, 'StatusCode': 0}
2019-11-21 22:04:07,091 - root - INFO - Container exited with success 👍
2019-11-21 22:04:07,091 - root - INFO - Stopping container: 8ff0cb9f431b
2019-11-21 22:04:07,092 - root - INFO - Removing container: 8ff0cb9f431b
0.71user 0.22system 3:34.27elapsed 0%CPU (0avgtext+0avgdata 65920maxresident)
I checked out master and updated the submodules, and encountered the same error originally reported by alopez1327. My steps:
git clone https://github.com/apache/incubator-mxnet
cd incubator-mxnet/
git submodule update --init --recursive
time ci/build.py -p armv7
...
The following packages will be upgraded:
autoconf
1 upgraded, 1 newly installed, 0 to remove and 84 not upgraded.
Need to get 1169 kB of archives.
After this operation, 2502 kB of additional disk space will be used.
WARNING: The following packages cannot be authenticated!
autoconf
E: There are problems and -y was used without --force-yes
The command '/bin/sh -c /work/deb_ubuntu_ccache.sh' returned a non-zero code: 100
Traceback (most recent call last):
File "ci/build.py", line 454, in <module>
sys.exit(main())
File "ci/build.py", line 364, in main
num_retries=args.docker_build_retries, no_cache=args.no_cache)
File "ci/build.py", line 116, in build_docker
run_cmd()
File "/Users/[NAME]/dev/incubator-mxnet/ci/util.py", line 81, in f_retry
return f(*args, **kwargs)
File "ci/build.py", line 114, in run_cmd
check_call(cmd)
File "/Users/[NAME]/anaconda3/envs/[ENV]/lib/python3.7/subprocess.py", line 347, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['docker', 'build', '-f', 'docker/Dockerfile.build.armv7', '--build-arg', 'USER_ID=501', '--build-arg', 'GROUP_ID=20', '--cache-from', 'mxnetci/build.armv7', '-t', 'mxnetci/build.armv7', 'docker']' returned non-zero exit status 100.
395.68 real 1.28 user 0.96 sys
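The traceback passes through f_retry in ci/util.py, and build_docker is called with num_retries, so the docker build appears to be wrapped in a retry helper. Below is a minimal sketch of such a wrapper, assuming a fixed number of attempts with no delay. The retry decorator and failing_build are hypothetical names for illustration and are not the code in ci/util.py; the child process only stands in for a docker build that exits with code 100.

```python
import functools
import subprocess
import sys


def retry(num_attempts, exceptions=(subprocess.CalledProcessError,)):
    """Hypothetical retry decorator: re-raise the last error once attempts run out."""
    def decorator(f):
        @functools.wraps(f)
        def f_retry(*args, **kwargs):
            for attempt in range(1, num_attempts + 1):
                try:
                    return f(*args, **kwargs)
                except exceptions:
                    if attempt == num_attempts:
                        raise
        return f_retry
    return decorator


calls = []  # records how many times the wrapped function actually ran


@retry(num_attempts=3)
def failing_build():
    calls.append(1)
    # Stand-in for `docker build`: a child process that always exits with code 100.
    subprocess.check_call([sys.executable, "-c", "import sys; sys.exit(100)"])
```

Calling failing_build() runs the child process three times and then raises CalledProcessError with return code 100. Retrying helps with transient network failures, but a deterministic failure such as the unauthenticated autoconf package fails the same way on every attempt.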
This also affects the CI.
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fedge/detail/PR-17031/13/pipeline/
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fedge/detail/PR-17031/14/pipeline/
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fedge/detail/PR-17031/15/pipeline/
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fedge/detail/PR-17031/16/pipeline/
My PR updates the ubuntu_arm.sh script (https://github.com/apache/incubator-mxnet/pull/17031/commits/dc4e18a61d6c30b80f4265686f7f740aa8182973), so the cached Docker container can't be reused and is rebuilt. The build fails due to this issue.
@larroy the reason that ci/build.py -p armv7 works is that it uses the cache.
Would there be any problem with switching to the updated dockcross images? They recently moved to Debian stretch: https://github.com/dockcross/dockcross/commit/aae501313ef206b769b0204067cebd1604029fcd
Where is mxnetcipinned/dockcross-linux-X maintained? I couldn't find it at https://github.com/apache/incubator-mxnet-ci/search?q=mxnetcipinned&unscoped_q=mxnetcipinned
It may be worth considering tracking dockcross directly. That way we can avoid the case where our CI only continues to work because of a cache.
When we tracked dockcross directly, it was breaking CI randomly. That's why it's pinned. Anyone can update this via PR.
This is correct. To reproduce the container rebuild:
piotr@44-229-42-241:0: ~/mxnet [upstream_master]> time ci/build.py -p armv7 --no-cache 2>&1 | tee armv7.log
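To make the cache dependence concrete, here is a rough sketch of how the docker build command shown in the traceback could be assembled with and without the cache. The helper docker_build_cmd and its parameters are hypothetical and not taken from ci/build.py, and how ci/build.py actually translates its --no-cache option into docker arguments is an assumption here. Only --cache-from and --no-cache are real docker build flags.

```python
def docker_build_cmd(platform, user_id, group_id, no_cache=False):
    """Hypothetical helper mirroring the command printed in the traceback."""
    tag = "mxnetci/build." + platform
    cmd = [
        "docker", "build",
        "-f", "docker/Dockerfile.build." + platform,
        "--build-arg", "USER_ID={}".format(user_id),
        "--build-arg", "GROUP_ID={}".format(group_id),
    ]
    if no_cache:
        # Assumption: rebuild every layer, so deb_ubuntu_ccache.sh runs again.
        cmd.append("--no-cache")
    else:
        # Reuse layers from the published image; cached steps are skipped.
        cmd.extend(["--cache-from", tag])
    cmd.extend(["-t", tag, "docker"])
    return cmd
```

With no_cache=False the result matches the command in the CalledProcessError message above. With no_cache=True the apt step inside the Dockerfile has to run again, which is where the authentication error shows up.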
As part of https://github.com/apache/incubator-mxnet/issues/16753, the docker containers have been updated and switched to the upstream version. Thus this issue can be closed.
I opened https://github.com/apache/incubator-mxnet/issues/17151 to track pinning the upstream containers again @larroy
Description
Tried building the library using Docker with the command: python3 ci/build.py -p armv7, but compilation failed because it seems Docker is pulling some binaries from an old repository (jessie). Tried running the same code with --force-yes, with the same result.
Also tried to compile directly on the Raspberry Pi, but it fails because of insufficient VM, even after I increased it to 4096M. I'll move to cross-compilation next.
Error Message
WARNING: The following packages cannot be authenticated!
autoconf
E: There are problems and -y was used without --force-yes
The command '/bin/sh -c /work/deb_ubuntu_ccache.sh' returned a non-zero code: 100
Traceback (most recent call last):
File "ci/build.py", line 454, in <module>
sys.exit(main())
File "ci/build.py", line 364, in main
num_retries=args.docker_build_retries, no_cache=args.no_cache)
File "ci/build.py", line 116, in build_docker
run_cmd()
File "/Volumes/External/mxnet/ci/util.py", line 81, in f_retry
return f(*args, **kwargs)
File "ci/build.py", line 114, in run_cmd
check_call(cmd)
File "/usr/local/Cellar/python/3.7.5/Frameworks/Python.framework/Versions/3.7/lib/python3.7/subprocess.py", line 363, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['docker', 'build', '-f', 'docker/Dockerfile.build.armv7', '--build-arg', 'USER_ID=501', '--build-arg', 'GROUP_ID=20', '--cache-from', 'mxnetci/build.armv7', '-t', 'mxnetci/build.armv7', 'docker']' returned non-zero exit status 100.
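When comparing logs from different runs, it can help to pull out which packages apt refused to authenticate. The following is a small sketch, assuming the log has the shape shown above (the warning line followed by one or more lines of package names). The function unauthenticated_packages is a hypothetical helper and not part of the MXNet CI scripts.

```python
import re

AUTH_WARNING = "WARNING: The following packages cannot be authenticated!"
# Debian package names: lowercase letters, digits, plus, minus and dot.
PACKAGE_NAME = re.compile(r"[a-z0-9][a-z0-9+.-]*")


def unauthenticated_packages(log_text):
    """Return the package names listed after each apt authentication warning."""
    packages = []
    collecting = False
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped == AUTH_WARNING:
            collecting = True
            continue
        if collecting:
            tokens = stripped.split()
            if tokens and all(PACKAGE_NAME.fullmatch(t) for t in tokens):
                packages.extend(tokens)
            else:
                # Any other line (for example "E: ...") ends the package list.
                collecting = False
    return packages
```

For the log in this report the helper returns a list containing only autoconf.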
To Reproduce
Just run python3 ci/build.py -p armv7
Environment
We recommend using our script for collecting the diagnostic information. Run the following command and paste the outputs below:
----------Python Info----------
Version : 3.7.5
Compiler : Clang 11.0.0 (clang-1100.0.33.12)
Build : ('default', 'Nov 2 2019 10:02:20')
Arch : ('64bit', '')
------------Pip Info-----------
Version : 19.3.1
Directory : /Volumes/External/pyenv/tvmenv/lib/python3.7/site-packages/pip
----------MXNet Info-----------
Version : 1.4.1
Directory : /Volumes/External/pyenv/tvmenv/lib/python3.7/site-packages/mxnet
An error occured trying to import mxnet.
This is very likely due to missing missing or incompatible library files.
Traceback (most recent call last):
File "", line 122, in check_mxnet
AttributeError: module 'mxnet.util' has no attribute 'get_gpu_count'
----------System Info----------
Platform : Darwin-19.0.0-x86_64-i386-64bit
system : Darwin
node : Catalyst-RD.local
release : 19.0.0
version : Darwin Kernel Version 19.0.0: Thu Oct 17 16:17:15 PDT 2019; root:xnu-6153.41.3~29/RELEASE_X86_64
----------Hardware Info----------
machine : x86_64
processor : i386
b'machdep.cpu.brand_string: Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz'
b'machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C'
b'machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET BMI1 AVX2 SMEP BMI2 ERMS INVPCID FPU_CSDS MDCLEAR IBRS STIBP L1DF SSBD'
b'machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT RDTSCP TSCI'
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0310 sec, LOAD: 0.4920 sec.
Timing for GluonNLP GitHub: https://github.com/dmlc/gluon-nlp, DNS: 0.0005 sec, LOAD: 0.5698 sec.
Timing for GluonNLP: http://gluon-nlp.mxnet.io, DNS: 0.0007 sec, LOAD: 0.0289 sec.
Timing for D2L: http://d2l.ai, DNS: 0.0008 sec, LOAD: 0.3076 sec.
Timing for D2L (zh-cn): http://zh.d2l.ai, DNS: 0.0006 sec, LOAD: 0.1726 sec.
Timing for FashionMNIST: https://repo.mxnet.io/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0007 sec, LOAD: 0.3055 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0089 sec, LOAD: 1.1580 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0005 sec, LOAD: 0.1895 sec.