TByte007 opened this issue 2 years ago
Hmm, can you try to recompile with USE_QNNPACK=0? And please share build/CMakeCache.txt.
I think we still test cross-compilation to ARMv7, but do not have any CI for validating compilation on armv7.
If you have a fix for the problem, please do not hesitate to submit a PR.
It's from my last test, where I set export CFLAGS="-march=armv7ve+simd -mfpu=neon-vfpv4"
and exported both NO_QNNPACK=1
and QNNPACK=0,
but it looks like the build system doesn't care about those being set.
On this last try I even set NO_QNNPACK=1,
but I still see no difference in the build. NO_CUDA=1
is set, yet it still tries to find CUDA.
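For what it's worth, a quick check (not PyTorch-specific) shows that exported variables do reach child processes, so if the build ignores them, the switch names were probably changed (to the USE_* family) rather than the export getting lost:

```shell
# Sanity check: an exported variable is visible to the child process that
# setup.py runs in, so the export mechanism itself is not the problem.
export NO_QNNPACK=1
python3 -c 'import os; print(os.environ.get("NO_QNNPACK"))'
```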
I'm not that familiar with the build system, and after some looking around I decided to ask for pointers before digging deeper into what is going on. And that semicolon at the end of armv7-a;
looks suspicious, as if it were hard-coded somewhere.
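A stray trailing semicolon like that is often a CMake artifact: CMake lists are semicolon-separated strings, so interpolating a list directly into a flag string leaks the `;` into the compiler command line. A hypothetical illustration (not PyTorch's actual code):

```cmake
# Hypothetical sketch: a CMake list expands with ';' separators when
# interpolated into a string, which can leak into compiler command lines.
set(ARCH_FLAGS "-march=armv7-a" "-mfpu=neon")
message("${ARCH_FLAGS}")       # prints: -march=armv7-a;-mfpu=neon
# Joining the list first avoids the stray semicolon:
string(JOIN " " ARCH_FLAGS_STR ${ARCH_FLAGS})
message("${ARCH_FLAGS_STR}")   # prints: -march=armv7-a -mfpu=neon
```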
I'll try USE_QNNPACK=0 now (I guess it has been changed from NO_XXX?).
CMakeCache.txt
Nothing changed. I don't see a way to (easily) stop it from trying to build QNNPACK!
Consolidate compiler generated dependencies of target pytorch_qnnpack
[ 70%] Building ASM object confu-deps/pytorch_qnnpack/CMakeFiles/pytorch_qnnpack.dir/src/q8gemm_sparse/4x8c1x4-dq-packedA-aarch32-neon.S.o
And the errors are, of course, exactly the same!
PS1: Could it be this? USE_PYTORCH_QNNPACK : ON
Yes, try USE_PYTORCH_QNNPACK=0
and I think the compiler is actually complaining about the following (i.e. not about something generated by cmake/setup.py):
https://github.com/pytorch/pytorch/blob/8f71e8de7ef33e0cc3c92d976aa0eedae92fa1aa/aten/src/ATen/native/quantized/cpu/qnnpack/src/q8gemm_sparse/4x8c1x4-dq-packedA-aarch32-neon.S#L13
It seems to work (kind of), but I put "OFF" in place of "0". Compiling C++ on a system with less than <sarcasm>
500 GB of RAM and fewer than 128 cores </sarcasm>
can drive you insane; it reminded me of one of the reasons I hate C++ so much. I had to mount an 8 GB iSCSI drive from a remote RAM drive as swap, and of course the LAN card is only 100 Mbit/s, but local swap managed to crash the system twice and the SD card is not much faster anyway. Even so, I can only use 2 of the 4 cores with 1 GB of RAM :) . After more than 24 hours and fixing a few "%ld"s to "%d", I got here, which might be linked to those "%ld"s:
[ 98%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/python/python_list.cpp.o
In file included from /rstorage/drone/pytorch_install/pytorch/torch/csrc/jit/python/python_list.cpp:3:
/rstorage/drone/pytorch_install/pytorch/third_party/pybind11/include/pybind11/detail/common.h: In instantiation of ‘pybind11::ssize_t pybind11::ssize_t_cast(const IntType&) [with IntType = long long int; pybind11::ssize_t = int]’:
/rstorage/drone/pytorch_install/pytorch/third_party/pybind11/include/pybind11/pytypes.h:1985:68: required from ‘pybind11::list::list(SzType) [with SzType = long long int; typename std::enable_if<std::is_integral<_Tp>::value, int>::type <anonymous> = 0]’
/rstorage/drone/pytorch_install/pytorch/torch/csrc/jit/python/python_list.cpp:32:25: required from here
/rstorage/drone/pytorch_install/pytorch/third_party/pybind11/include/pybind11/detail/common.h:410:35: error: static assertion failed: Implicit narrowing is not permitted.
410 | static_assert(sizeof(IntType) <= sizeof(ssize_t), "Implicit narrowing is not permitted.");
| ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
/rstorage/drone/pytorch_install/pytorch/third_party/pybind11/include/pybind11/detail/common.h:410:35: note: ‘(sizeof (long long int) <= sizeof (pybind11::ssize_t))’ evaluates to false
PS1: I re-cloned the repository just in case, but I got here again. (Maybe something somewhere is messing with the sizes of the int types?):
/rstorage/drone/pytorch_install/pytorch/torch/csrc/utils/python_arg_parser.h: In member function ‘std::vector<double, std::allocator<double> > torch::PythonArgs::getDoublelist(int)’:
/rstorage/drone/pytorch_install/pytorch/torch/csrc/utils/python_arg_parser.h:751:82: error: format ‘%ld’ expects argument of type ‘long int’, but argument 7 has type ‘int’ [-Werror=format=]
751 | "%s(): argument '%s' must be %s, but found element of type %s at pos %ld",
| ~~^
| |
| long int
| %d
......
756 | idx + 1);
| ~~~~~~~
| |
| int
Any solutions for this besides disabling QNNPACK? I just got the error trying to build the latest main for armv7-a.
🐛 Describe the bug
I'm trying to build PyTorch on an OrangePi PC (H3 quad-core Cortex-A7), but for some reason I get:
Is that semicolon in the wrong place? The actual -march is:
After which I get about a thousand similar errors like this: Error: selected processor does not support 'command' in ARM mode. Is there a way to force/change -march, or is it a bug in the auto-detecting code of
setup.py
/ cmake(s)?
Versions
Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.1 LTS (armv7l)
GCC version: (Ubuntu 11.2.0-19ubuntu1) 11.2.0
Clang version: Could not collect
CMake version: version 3.22.1
Libc version: glibc-2.35

Python version: 3.10.6 (main, Aug 10 2022, 11:40:04) [GCC 11.3.0] (32-bit runtime)
Python platform: Linux-5.15.72-sunxi-armv7l-with-glibc2.35
Is CUDA available: N/A
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.5
[conda] Could not collect
cc @jerryzh168 @jianyuh @raghuramank100 @jamesr66a @vkuzo @jgong5 @Xia-Weiwen @leslie-fang-intel @malfet @snadampal