Open neilwang0913 opened 2 years ago
Same problem here.
sh example.sh ✔
-- The CXX compiler identification is GNU 11.3.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/cuda/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- The CUDA compiler identification is NVIDIA 11.7.99
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /opt/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
/usr/lib/python3.10/site-packages/torch
-- pybind11 v2.6.3 dev1
CMake Warning (dev) at /var/lib/snapd/snap/cmake/1156/share/cmake-3.24/Modules/CMakeDependentOption.cmake:89 (message):
Policy CMP0127 is not set: cmake_dependent_option() supports full Condition
Syntax. Run "cmake --help-policy CMP0127" for policy details. Use the
cmake_policy command to set the policy and suppress this warning.
Call Stack (most recent call first):
3rd/pybind11/CMakeLists.txt:98 (cmake_dependent_option)
This warning is for project developers. Use -Wno-dev to suppress it.
-- Found PythonInterp: /usr/bin/python3.10 (found version "3.10.6")
-- Found PythonLibs: /usr/lib/libpython3.10.so
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Success
-- Configuring done
CMake Warning (dev) in CMakeLists.txt:
Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
empty CUDA_ARCHITECTURES not allowed. Run "cmake --help-policy CMP0104" for
policy details. Use the cmake_policy command to set the policy and suppress
this warning.
CUDA_ARCHITECTURES is empty for target "LMCoreKernel".
This warning is for project developers. Use -Wno-dev to suppress it.
CMake Warning (dev) in CMakeLists.txt:
Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
empty CUDA_ARCHITECTURES not allowed. Run "cmake --help-policy CMP0104" for
policy details. Use the cmake_policy command to set the policy and suppress
this warning.
CUDA_ARCHITECTURES is empty for target "LMCoreKernel".
This warning is for project developers. Use -Wno-dev to suppress it.
-- Generating done
-- Build files have been written to: /home/yunnan/repos/DeepLM/build
[ 10%] Building CUDA object CMakeFiles/LMCoreKernel.dir/TorchLM/cpp/kernel_impl.cu.o
[ 20%] Building CXX object CMakeFiles/BACore.dir/BAProblem/cpp/baproblem_manager.cc.o
[ 30%] Building CXX object CMakeFiles/BACore.dir/BAProblem/cpp/interface.cc.o
[ 40%] Building CXX object CMakeFiles/BACore.dir/BAProblem/cpp/io.cc.o
[ 50%] Building CXX object CMakeFiles/BACore.dir/BAProblem/cpp/torch_util.cc.o
[ 60%] Linking CUDA shared library libLMCoreKernel.so
[ 60%] Built target LMCoreKernel
[ 80%] Building CXX object CMakeFiles/LMCore.dir/TorchLM/cpp/kernel.cc.o
[ 80%] Building CXX object CMakeFiles/LMCore.dir/TorchLM/cpp/interface.cc.o
/home/yunnan/repos/DeepLM/BAProblem/cpp/baproblem_manager.cc: In function ‘std::vector<std::vector^~~~~~
/home/yunnan/repos/DeepLM/BAProblem/cpp/baproblem_manager.cc:64:21: warning: unused variable ‘dPtIdx’ [-Wunused-variable]
64 | const long dPtIdx = static_cast<const long>(pointIdx.storage().data());
| ^~
/home/yunnan/repos/DeepLM/BAProblem/cpp/baproblem_manager.cc:117:14: warning: variable ‘intOptions’ set but not used [-Wunused-but-set-variable]
117 | auto intOptions = torch::TensorOptions().dtype(torch::kInt64);
| ^~~~~~
/home/yunnan/repos/DeepLM/BAProblem/cpp/io.cc: In function ‘std::vector^~~~~~~
/home/yunnan/repos/DeepLM/BAProblem/cpp/io.cc:49:27: warning: comparison of integer expressions of different signedness: ‘int’ and ‘std::vector<double, std::allocator^~~~~~~
/home/yunnan/repos/DeepLM/TorchLM/cpp/kernel.cc: In function ‘void JacobiColumnSquare(const std::vector^~~~
/home/yunnan/repos/DeepLM/TorchLM/cpp/kernel.cc:258:13: warning: unused variable ‘residualDim’ [-Wunused-variable]
258 | int residualDim = indices.size();
| ^~~
/home/yunnan/repos/DeepLM/TorchLM/cpp/kernel.cc: In function ‘void ColumnInverseSquare(std::vector^~~~~~~~
/home/yunnan/repos/DeepLM/TorchLM/cpp/kernel.cc: In function ‘void JacobiNormalize(const std::vector^~~~~~~~
/home/yunnan/repos/DeepLM/TorchLM/cpp/kernel.cc:335:21: warning: unused variable ‘num’ [-Wunused-variable]
335 | int num = numDimV ~~
/home/yunnan/repos/DeepLM/TorchLM/cpp/kernel.cc: In function ‘void JacobiLeftMultiplyCuda(const std::vector~~
/home/yunnan/repos/DeepLM/TorchLM/cpp/kernel.cc: In function ‘void JacobiColumnSquareCuda(const std::vector^~~~
/home/yunnan/repos/DeepLM/TorchLM/cpp/kernel.cc: In function ‘void ColumnInverseSquareCuda(std::vector^~~~~~~~
/home/yunnan/repos/DeepLM/TorchLM/cpp/kernel.cc: In function ‘void JacobiNormalizeCuda(const std::vector^~~~~~~~
/home/yunnan/repos/DeepLM/TorchLM/cpp/kernel.cc:524:21: warning: unused variable ‘num’ [-Wunused-variable]
524 | int num = numDimV * numDimP;
| ^~~
[ 90%] Linking CXX shared library BACore.cpython-310-x86_64-linux-gnu.so
[ 90%] Built target BACore
[100%] Linking CXX shared library LMCore.cpython-310-x86_64-linux-gnu.so
[100%] Built target LMCore
--2022-10-05 19:54:22-- https://grail.cs.washington.edu/projects/bal/data/ladybug/problem-49-7776-pre.txt.bz2
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Resolving grail.cs.washington.edu (grail.cs.washington.edu)... 2607:4000:200:14::5d, 128.208.5.93
Connecting to grail.cs.washington.edu (grail.cs.washington.edu)|2607:4000:200:14::5d|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 448484 (438K) [application/x-bzip2]
Saving to: ‘problem-49-7776-pre.txt.bz2’
problem-49-7776-pre 100%[===================>] 437.97K 211KB/s in 2.1s
2022-10-05 19:54:26 (211 KB/s) - ‘problem-49-7776-pre.txt.bz2’ saved [448484/448484]
Traceback (most recent call last):
File "/home/yunnan/repos/DeepLM/examples/BundleAdjuster/bundle_adjuster.py", line 5, in
I did not run into this error. I googled it, and someone suggests adding "--use-cxx11-abi" as a solution, with a reference link: https://pytorch.org/TensorRT/tutorials/installation.html#installation (choosing the right ABI).
Let me know if it helps :)
In the end I installed the conda environment from https://github.com/CompVis/stable-diffusion, and then it magically worked.
I had to export the following:
export TCNN_CUDA_ARCHITECTURE=86
export CUDA_HOME="/usr/local/cuda-11.7"
export PATH="/usr/local/cuda-11.7/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-11.7/lib64:$LD_LIBRARY_PATH"
before running example.sh
It solved the
import BACore
ModuleNotFoundError: No module named 'BACore'
problem, but then I got
Traceback (most recent call last):
File "examples/BundleAdjuster/bundle_adjuster.py", line 8, in <module>
from BAProblem.rotation import AngleAxisRotatePoint
ModuleNotFoundError: No module named 'BAProblem'
Dirty fix: add sys.path.append(os.path.dirname(os.path.dirname(__file__)))
to the example script.
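A minimal sketch of that `sys.path` workaround, under some assumptions: the helper names `repo_root_for` and `ensure_on_sys_path` are made up for illustration, and the right number of directory levels depends on where the script sits relative to the `BAProblem` package. In the layout shown in the tracebacks (`DeepLM/examples/BundleAdjuster/bundle_adjuster.py`), two `dirname` calls only reach `examples`, while three reach the repository root, which may be why a later comment in this thread uses three.

```python
import os
import sys


def repo_root_for(script_path, levels_up=3):
    """Strip `levels_up` trailing components from the absolute script path.

    levels_up=3 is an assumption that fits
    <repo>/examples/BundleAdjuster/bundle_adjuster.py; adjust for other layouts.
    """
    path = os.path.abspath(script_path)
    for _ in range(levels_up):
        path = os.path.dirname(path)
    return path


def ensure_on_sys_path(directory):
    """Put `directory` at the front of sys.path unless it is already listed."""
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory
```

In the example script this would be called as `ensure_on_sys_path(repo_root_for(__file__))` before the `from BAProblem.rotation import ...` line. Using an absolute path avoids depending on the directory the script is launched from.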
Hi! I'm also running into an error when running the example script, and it's different from the errors in previous issues. Any suggestions of what's wrong would be very much appreciated.
TORCH_USE_RTLD_GLOBAL=YES python3 examples/BundleAdjuster/bundle_adjuster.py --balFile ./data/problem-49-7776-pre.txt --device cuda
Load observation 31000 of 31843...
Initial cost = 8.509125E+05, Memory = 2.321921E-03 G
Traceback (most recent call last):
File "examples/BundleAdjuster/bundle_adjuster.py", line 40, in <module>
numSuccessIterations = 15)
File "/home/juanmohedano/OnePose_Plus_Plus/submodules/DeepLM/TorchLM/solver.py", line 726, in Solve
solver.Solve()
File "/home/juanmohedano/OnePose_Plus_Plus/submodules/DeepLM/TorchLM/solver.py", line 617, in Solve
self.ComputeTrustRegionStep()
File "/home/juanmohedano/OnePose_Plus_Plus/submodules/DeepLM/TorchLM/solver.py", line 560, in ComputeTrustRegionStep
step = self.LinearSolve(lmDiagonal);
File "/home/juanmohedano/OnePose_Plus_Plus/submodules/DeepLM/TorchLM/solver.py", line 428, in LinearSolve
ListInvert(preconditioner)
File "/home/juanmohedano/OnePose_Plus_Plus/submodules/DeepLM/TorchLM/listvec.py", line 25, in ListInvert
listvec[i] = torch.inverse(listvec[i])
torch._C._LinAlgError: linalg.inv: (Batch element 0): The diagonal element 1 is zero, the inversion could not be completed because the input matrix is singular.
Neither the "--use-cxx11-abi" suggestion nor the exports and the sys.path fix above work for me. Is there any other solution? I ran into so many issues building oneposeplus.
I ran into the same error. I try to install BACore package but I can't find the source.
I finally found that I didn't download the 3rd party file 'eigen' and 'pybind11', which is not included in this ZIP. You have to download it manually and it works.
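A quick way to spot this before building is to check that the third-party directories exist and are not empty. The sketch below is an assumption-laden illustration: the build log above confirms `3rd/pybind11`, but `3rd/eigen` is a guess at where Eigen is expected, and `missing_third_party` is a hypothetical helper, not part of DeepLM.

```python
from pathlib import Path


def missing_third_party(repo_root, names=("eigen", "pybind11"), subdir="3rd"):
    """Return the names whose directory under <repo_root>/<subdir> is absent or empty.

    The default names and the "3rd" subdirectory are assumptions; adjust them
    to match the paths referenced in the project's CMakeLists.txt.
    """
    missing = []
    for name in names:
        directory = Path(repo_root) / subdir / name
        if not directory.is_dir() or not any(directory.iterdir()):
            missing.append(name)
    return missing
```

Running `missing_third_party(".")` from the repository root returns an empty list when both directories are populated, and the names of the ones to download otherwise.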
Dirty fix: add `sys.path.append(os.path.dirname(os.path.dirname(__file__)))` to the example script.
This helped. I changed the last line of example.sh to:
TORCH_USE_RTLD_GLOBAL=YES python3 ../DeepLM/examples/BundleAdjuster/bundle_adjuster.py --balFile ./data/problem-49-7776-pre.txt --device cuda
and adding
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(__file__))))
to the bundle_adjuster.py file
None of the answers worked for me. I still get an error at this line:
from BAProblem.rotation import AngleAxisRotatePoint
Did anyone solve this?
I met the same problem and finally solved it by setting the Python version correctly, which can be edited in CMakeLists.txt
like this 👇
set(PYTHOH3_VERSION 3.9m)
After setting the Python version to 3.9 and running sh example.sh, I found that the Python library was compiled under the correct name, 'BACore.cpython-39-x86_64-linux-gnu.so' (it was 'BACore.cpython-38-x86_64-linux-gnu.so' before), and this solved the ImportError: xxx problem.
@shivakarnati
I guess @Penterakt was able to solve the problem by installing SD's conda env mainly because its Python version is 3.8.5, which is compatible with the py38 build of xxx.so.
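The version mismatch described here can be checked directly: CPython only imports an extension module whose file name carries a suffix it recognises, and `sysconfig.get_config_var("EXT_SUFFIX")` reports the suffix the running interpreter builds with. The sketch below is a rough diagnostic; `check_extensions` is a hypothetical helper, and the directory holding the built files is an assumption you pass in yourself.

```python
import sysconfig
from pathlib import Path


def expected_ext_suffix():
    """Suffix the running interpreter uses for extension modules,
    e.g. '.cpython-39-x86_64-linux-gnu.so' for CPython 3.9 on x86-64 Linux."""
    return sysconfig.get_config_var("EXT_SUFFIX")


def check_extensions(directory, module_names=("BACore", "LMCore")):
    """Map each module name to (matches_interpreter, files_found).

    `directory` is wherever the build placed the compiled modules; that
    location is an assumption and is not taken from the project.
    """
    suffix = expected_ext_suffix()
    report = {}
    for name in module_names:
        built = sorted(p.name for p in Path(directory).glob(name + ".*"))
        report[name] = ((name + suffix) in built, built)
    return report
```

If `check_extensions(...)` reports `False` for `BACore` and lists a file tagged for another Python version, the build targeted a different interpreter than the one running the example.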
Please make sure the right CUDA version is installed.
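One possible reading of "right" is that the CUDA toolkit found on PATH should match the CUDA version PyTorch was built against; that interpretation is an assumption, not something stated in this thread. A small sketch for comparing the two, with hypothetical helper names:

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_nvcc_release():
    """Release of the first nvcc on PATH, or None if nvcc is not found."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True,
                            text=True, check=False)
    return parse_nvcc_release(result.stdout)


def torch_cuda_release():
    """CUDA release PyTorch was built with, or None (no torch / CPU-only build)."""
    try:
        import torch
    except ImportError:
        return None
    version = torch.version.cuda  # a string such as "11.7", or None
    if not version:
        return None
    major, minor = version.split(".")[:2]
    return int(major), int(minor)
```

Comparing `installed_nvcc_release()` with `torch_cuda_release()` shows at a glance whether the two disagree, for example when PATH points at a different toolkit than the one exported via CUDA_HOME.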
Hi: After running example.sh, I got this error:
import BACore
ModuleNotFoundError: No module named 'BACore'
Any idea how to fix it?
Many thanks