TByte007 opened this issue 2 years ago
Hmm, can you try to recompile with USE_QNNPACK=0? And please share build/CMakeCache.txt.
I think we still test cross-compilation to ARMv7, but do not have any CI for validating compilation on armv7.
If you have a fix for the problem, please do not hesitate to submit a PR.
It's from my last test, where I set export CFLAGS="-march=armv7ve+simd -mfpu=neon-vfpv4"
and exported both NO_QNNPACK=1
and QNNPACK=0,
but it looks like the build system doesn't care about those being set.
On this last try I even set NO_QNNPACK=1,
but I still see no difference in the build. NO_CUDA=1
is set, yet it still tries to find CUDA.
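For what it's worth, a quick check (not PyTorch-specific) shows that exported variables do reach child processes, so if the build ignores them, the switch names were probably changed (to the USE_* family) rather than the export getting lost:

```shell
# Sanity check: an exported variable is visible to the child process that
# setup.py runs in, so the export mechanism itself is not the problem.
export NO_QNNPACK=1
python3 -c 'import os; print(os.environ.get("NO_QNNPACK"))'
```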
I'm not that familiar with the build system, and after some looking around I decided to ask for pointers before digging deeper into what is going on. And that semicolon at the end of armv7-a;
looks suspicious, as if it were hard-coded somewhere.
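A stray trailing semicolon like that is often a CMake artifact: CMake lists are semicolon-separated strings, so interpolating a list directly into a flag string leaks the `;` into the compiler command line. A hypothetical illustration (not PyTorch's actual code):

```cmake
# Hypothetical sketch: a CMake list expands with ';' separators when
# interpolated into a string, which can leak into compiler command lines.
set(ARCH_FLAGS "-march=armv7-a" "-mfpu=neon")
message("${ARCH_FLAGS}")       # prints: -march=armv7-a;-mfpu=neon
# Joining the list first avoids the stray semicolon:
string(JOIN " " ARCH_FLAGS_STR ${ARCH_FLAGS})
message("${ARCH_FLAGS_STR}")   # prints: -march=armv7-a -mfpu=neon
```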
I'll try USE_QNNPACK=0 now (I guess it has been changed from NO_XXX?).
CMakeCache.txt
Nothing changed. I don't see a way to (easily) stop it from trying to build QNNPACK!
Consolidate compiler generated dependencies of target pytorch_qnnpack
[ 70%] Building ASM object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8gemm_sparse/4x8c1x4-dq-packedA-aarch32-neon.S.o
And the errors are, of course, exactly the same!
PS1: Could it be this? USE_PYTORCH_QNNPACK : ON
Yes, try USE_PYTORCH_QNNPACK=0
and I think the compiler is actually complaining about the following (i.e. not about something generated by cmake/setup.py):
https://github.com/pytorch/pytorch/blob/8f71e8de7ef33e0cc3c92d976aa0eedae92fa1aa/aten/src/ATen/native/quantized/cpu/qnnpack/src/q8gemm_sparse/4x8c1x4-dq-packedA-aarch32-neon.S#L13
It seems to work (kind of), but I put "OFF" in place of "0". Compiling C++ on a system with less than <sarcasm>
500 GB of RAM and fewer than 128 cores </sarcasm>
can drive you insane; it reminded me of one of the reasons I hate C++ so much. I had to mount an 8 GB iSCSI drive from a remote RAM drive as swap, and of course the LAN card is only 100 Mbit/s, but local swap managed to crash the system twice and the SD card is not much faster anyway. Even so, I can only use 2 of the 4 cores with 1 GB of RAM :) . After more than 24 hours and fixing a few "%ld"s to "%d", I got here, which might be linked to those "%ld"s:
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/python_list.cpp.o
In file included from /rstorage/drone/pytorch_install/pytorch/torch/csrc/jit/python/python_list.cpp:3:
/rstorage/drone/pytorch_install/pytorch/third_party/pybind11/include/pybind11/detail/common.h: In instantiation of ‘pybind11::ssize_t pybind11::ssize_t_cast(const IntType&) [with IntType = long long int; pybind11::ssize_t = int]’:
/rstorage/drone/pytorch_install/pytorch/third_party/pybind11/include/pybind11/pytypes.h:1985:68: required from ‘pybind11::list::list(SzType) [with SzType = long long int; typename std::enable_if<std::is_integral<_Tp>::value, int>::type <anonymous> = 0]’
/rstorage/drone/pytorch_install/pytorch/torch/csrc/jit/python/python_list.cpp:32:25: required from here
/rstorage/drone/pytorch_install/pytorch/third_party/pybind11/include/pybind11/detail/common.h:410:35: error: static assertion failed: Implicit narrowing is not permitted.
410 | static_assert(sizeof(IntType) <= sizeof(ssize_t), "Implicit narrowing is not permitted.");
| ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
/rstorage/drone/pytorch_install/pytorch/third_party/pybind11/include/pybind11/detail/common.h:410:35: note: ‘(sizeof (long long int) <= sizeof (pybind11::ssize_t))’ evaluates to false
PS1: I re-cloned the repository just in case, but I got here again. (Maybe something somewhere is messing with the sizes of the int types?):
/rstorage/drone/pytorch_install/pytorch/torch/csrc/utils/python_arg_parser.h: In member function ‘std::vector<double, std::allocator<double> > torch::PythonArgs::getDoublelist(int)’:
/rstorage/drone/pytorch_install/pytorch/torch/csrc/utils/python_arg_parser.h:751:82: error: format ‘%ld’ expects argument of type ‘long int’, but argument 7 has type ‘int’ [-Werror=format=]
751 | "%s(): argument '%s' must be %s, but found element of type %s at pos %ld",
| ~~^
| |
| long int
| %d
......
756 | idx + 1);
| ~~~~~~~
| |
| int
Any solutions for this besides disabling QNNPACK? I just got the error trying to build the latest main for armv7-a.
🐛 Describe the bug
I'm trying to build PyTorch on an OrangePi PC (H3 quad-core Cortex-A7), but for some reason I get:
Is that semicolon in the wrong place? The actual -march is:
After which I get about a thousand similar errors like this: Error: selected processor does not support 'command' in ARM mode. Is there a way to force/change -march, or is it a bug in the auto-detecting code of
setup.py
/ cmake(s)?
Versions
Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.1 LTS (armv7l)
GCC version: (Ubuntu 11.2.0-19ubuntu1) 11.2.0
Clang version: Could not collect
CMake version: version 3.22.1
Libc version: glibc-2.35

Python version: 3.10.6 (main, Aug 10 2022, 11:40:04) [GCC 11.3.0] (32-bit runtime)
Python platform: Linux-5.15.72-sunxi-armv7l-with-glibc2.35
Is CUDA available: N/A
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.5
[conda] Could not collect
cc @jerryzh168 @jianyuh @raghuramank100 @jamesr66a @vkuzo @jgong5 @Xia-Weiwen @leslie-fang-intel @malfet @snadampal