Closed sebmestrallet closed 1 year ago
Thank you very much for this bug report!
Regarding the crash: Could you share the input mesh (and maybe the IGM, saved with --igm-out-path) with us for reproduction? At first glance, this seems to be a bug in HexEx.
Regarding the CMake warnings, I just checked CoMISo's CMakeLists (which causes that output). The "PETSC or MPI not found!" and "TAO or PETSC or MPI not found!" messages are to be interpreted as (PETSC not found) or (MPI not found), instead of NOT (MPI found or PETSC found). Therefore, even if CMake successfully found MPI, it may still print them. You can safely ignore those warnings.
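The difference between the two readings can be spelled out as a small truth table. The following Python sketch only restates the logic described above; it is not CoMISo's actual CMake code, and the function names are made up for illustration.

```python
from itertools import product


def message_printed(petsc_found: bool, mpi_found: bool) -> bool:
    """Reading described above: printed if either dependency is missing."""
    return (not petsc_found) or (not mpi_found)


def literal_reading(petsc_found: bool, mpi_found: bool) -> bool:
    """Literal reading of the message: printed only if neither was found."""
    return not (petsc_found or mpi_found)


if __name__ == "__main__":
    # Enumerate all four combinations of the two flags.
    for petsc_found, mpi_found in product([False, True], repeat=2):
        print(
            f"PETSC found={petsc_found!s:5} MPI found={mpi_found!s:5} "
            f"printed={message_printed(petsc_found, mpi_found)!s:5} "
            f"literal={literal_reading(petsc_found, mpi_found)}"
        )
```

With MPI found and PETSC missing, the first function returns True while the second returns False, which matches the situation reported in this thread.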
I executed HexMeshing on a (full!) tetrahedral mesh (VTK 2.0) and got the following error a few processing iterations after "#####Extracting hexmesh...":
HexMeshing: AlgoHex/external/OpenVolumeMesh/src/OpenVolumeMesh/Core/TopologyKernel.cc:2150: OpenVolumeMesh::HalfEdgeHandle OpenVolumeMesh::TopologyKernel::next_halfedge_in_halfface(OpenVolumeMesh::HalfEdgeHandle, OpenVolumeMesh::HalfFaceHandle) const: Assertion `_hfh.is_valid() && (size_t)_hfh.idx() < faces_.size() * 2u' failed.
The full logs are here logs.txt
This time I didn't disable MPI, the configure command was:
cmake .. -DGUROBI_HOME=~/.local/Gurobi/gurobi1003/linux64 -DCMAKE_BUILD_TYPE=Debug
But the output about MPI was confusing:
-- Found MPI_C: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
CMake Warning (dev) at /usr/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
  The package name passed to `find_package_handle_standard_args` (MPI_CXX)
  does not match the name of the calling package (MPI). This can lead to
  problems in calling code that expects `find_package` result variables
  (e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
  external/CoMISo/cmake/FindMPI.cmake:568 (find_package_handle_standard_args)
  external/CoMISo/CMakeLists.txt:138 (find_package)
This warning is for project developers. Use -Wno-dev to suppress it.
-- Found MPI_CXX: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so;/usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- PETSC or MPI not found!
-- TAO or PETSC or MPI not found!
MPI_C and MPI_CXX are found, but not MPI? Does your CMake output contain "TAO or PETSC or MPI not found!"? Despite all this, the configuration completes successfully. But MPI does not seem related to the above-mentioned error.
I'm on Ubuntu 22.04.3 with CMake 3.22.1.
Looks like GUROBI wasn't detected on your system. In this case, QGP wasn't involved in the pipeline. You might get an invalid IGM in the end and somehow HexEx crashed ...
In the README, when you say -DGUROBI_HOME=/path/to/gurobi, down to what depth should I give the path?
I tried with the parent folder, and with the lib subfolder. ~In both cases CMake writes "Found no Gurobi library version". So my first setting was the right one.~
external/CoMISo/cmake/FindGurobi.cmake wants the folder having include/ inside. I still have "Found no Gurobi library version".
If I remember right, by default Gurobi is installed to /opt/gurobiX, where X is some kind of version info. Then /opt/gurobiX/linux64 would be the correct path.
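A quick way to sanity-check a candidate GUROBI_HOME before re-running CMake is to test for the expected subfolders. This is only a rough sketch: the include/ requirement comes from the comment above about FindGurobi.cmake, while the lib/ check is an additional assumption about the install layout, and the helper name is hypothetical.

```python
from pathlib import Path


def looks_like_gurobi_home(candidate: str) -> bool:
    """Heuristic: a usable GUROBI_HOME has include/ and lib/ directly inside."""
    root = Path(candidate).expanduser()
    # include/ is what the finder is reported to look for in this thread;
    # lib/ is an assumption about the usual install layout.
    return (root / "include").is_dir() and (root / "lib").is_dir()


if __name__ == "__main__":
    # Path taken from the configure command earlier in this thread.
    home = "~/.local/Gurobi/gurobi1003/linux64"
    print(home, "->", looks_like_gurobi_home(home))
```

If this prints False for the path passed via -DGUROBI_HOME, the path is probably one level too high or too low.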
We're just investigating this here, there may be a problem with the gurobi finder.
We fixed a bug in the CoMISo Gurobi finder (https://gitlab.vci.rwth-aachen.de:9000/CoMISo/CoMISo/-/merge_requests/90), which is already merged in the CoMISo cgg2 branch that is used by AlgoHex.
AlgoHex automatically downloads CoMISo and should update to this; if it doesn't, you might have to remove the external folder to force a re-download.
Please let us know if this fixed the issue :)
We'll add some extra warning output for the case you encountered to let the user know they're not getting the full AlgoHex pipeline (e.g., without QGP3D).
Here is the end of the program execution. This time the MPI call before init error was triggered, not the assertion error.
#####Parametrizing (complete pipeline)...
#####Parametrizing...
#quantization path constraints = 0
Warning: COMISOSolver received a problem with non-constant hessian!!!
Initital dimension: 13527 x 13527, number of constraints: 3159, number of integer variables: 0, use reordering: yes
integer variables #: 0
continuous variables #: 10368
Timings:
Gauss Elimination 0.034675999999999929 s
System Elimination 0.14588300000000001 s
Mi-Solver 0.85377400000000003 s
Resubstitution 0.0042450000000000005 s
Total 1.038578
-----------> parametrization finished with #invalid_tets = 0 and #invalid_edge_valencies = 0
degeneracy sequence: (#inv-tets = 0#inv-edges = 0)
import_frames_from_parametrization...
#####Parametrizing (robust quantization pipeline)...
sizing scale factor for quantization = 1.00000
Set parameter Username
Set parameter TimeLimit to value 300
Academic license - for non-commercial use only - expires 2024-09-20
Gurobi Optimizer version 10.0.3 build v10.0.3rc0 (linux64)
CPU model: 12th Gen Intel(R) Core(TM) i5-12600H, instruction set [SSE2|AVX|AVX2]
Thread count: 16 physical cores, 16 logical processors, using up to 16 threads
Optimize a model with 135 rows, 70 columns and 244 nonzeros
Model fingerprint: 0xa3311031
Model has 27 quadratic objective terms
Variable types: 0 continuous, 70 integer (0 binary)
Coefficient statistics:
Matrix range [1e+00, 1e+00]
Objective range [2e-01, 8e+01]
QObjective range [2e+00, 4e+00]
Bounds range [0e+00, 0e+00]
RHS range [9e-01, 9e-01]
Loaded user MIP start with objective 3.41805
Presolve removed 116 rows and 52 columns
Presolve time: 0.01s
Presolved: 19 rows, 18 columns, 43 nonzeros
Presolved model has 19 quadratic objective terms
Variable types: 0 continuous, 18 integer (0 binary)
Root relaxation: objective 1.913013e+00, 5 iterations, 0.00 seconds (0.00 work units)
Another try with MIP start
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 1.91301 0 18 3.41805 1.91301 44.0% - 0s
0 0 2.58391 0 18 3.41805 2.58391 24.4% - 0s
0 2 2.58391 0 18 3.41805 2.58391 24.4% - 0s
Explored 10 nodes (43 simplex iterations) in 0.02 seconds (0.00 work units)
Thread count was 16 (of 16 available processors)
Solution count 1: 3.41805
Optimal solution found (tolerance 1.00e-04)
Best objective 3.418054003320e+00, best bound 3.418054003320e+00, gap 0.0000%
#hexahedra after QGP3D quantization = 9481
#invalid path constraints = 0, #valid path constraints = 15
sizing scale factor: 1.00000
#quantization path constraints = 45
--- initial energies ---
Symmetric Dirichlet = 12.09823 (#elements = 18972)
sum = 12.09823
-------------------------
#constraints = 11538
#independent constraints = 3159
exploit detected special properties: *constant jacobian of equality constraints* *constant jacobian of in-equality constraints*
*** The MPI_Comm_f2c() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
May not be related, but every time the program seems to freeze after printing "###ERROR message on numerical issues in KKT system can be ignored!" (~45min before the next output). Total time is 1h 13m for 13k tetrahedra.
Could you try the test-mpiinit branch?
On our Debian and macOS machines, it somehow works without MPI_Init. We suspect this may depend on different versions of ipopt or mumps that are compiled with real or "fake" MPI, cf. https://coin-or.github.io/Ipopt/INSTALL.html
By the way, you should know that the speed difference between Release and Debug builds is enormous, in case you just want to see whether it runs through. It may also make sense to save intermediate results with the appropriate command-line options to avoid having to re-run the long first part of the pipeline. Final note: according to @HengLiuNotAvailable, that KKT system error you are getting is indeed harmless and can be ignored.
No more MPI error, but the assertion failed
Did you happen to save the IGM? That would be perfect for debugging this HexEx crash.
Here you go : IGM.txt
Logs (cropped)
The IGM is invalid. Can you post the parameterization part in your log file?
I had to re-run it because last time I was limited by the max size of my console. stderr.txt stdout.txt
No "ohoh" nor "failed to find alpha0NextFace" this time (?).
I tried in release mode, and the hex mesh is successfully generated. Maybe you can send me the B0 mesh converted to OVM, so I can check if there is an issue with the topology construction in the VTK reader on my side?
It appears that I have missing connections/cells in the B0 hex-mesh. Is it an expected behaviour for this kind of shape? Is it a consequence of an invalid IGM? Can the invalid IGM be related to bad HalfFaceHandle?
When the IGM is invalid, the extracted hex mesh usually is broken as you showed. But from the stderr.txt file you attached earlier, I see "-----------> robust quantization parametrization finished with #invalid_tets = 0 and #invalid_edge_valencies = 0", which means the IGM is actually valid. There probably is a bug in HexEx that we need to look into. Just for me to understand: it works fine for you in release mode, but there's an assertion error in the hex mesh extraction step in debug mode, right?
Yes, this is what I ended up with.
This may not help you, but HexEx works well on my PC for this simple polycube-based hex-mesh extraction.
We fixed the "feature transfer" bug in HexEx. Could you try again?
I deleted the build & external folders, re-launched CMake, re-compiled, and re-ran in debug & release. I no longer have an assertion error, but hex meshes still have missing cells, on B0 and another simple shape.
CMake downloaded the fixed libHexEx:
-- AlgoHex: downloading missing dependency 'libHexEx'
-- Downloading/updating libHexEx
CMake Deprecation Warning at CMakeLists.txt:4 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
-- Configuring done
-- Generating done
-- Build files have been written to: /home/catB/sm266019/.local/AlgoHex/external/.cache/libHexEx
[ 11%] Performing update step for 'libHexEx-download'
warning: redirecting to https://gitlab.vci.rwth-aachen.de:9000/HexEx/libHexEx.git/
HEAD is now at 8a0ddd3 get rid of adjacent_halfface_in_cell function in finding halfface
[ 22%] No patch step for 'libHexEx-download'
[ 33%] No configure step for 'libHexEx-download'
[ 44%] No build step for 'libHexEx-download'
[ 55%] No install step for 'libHexEx-download'
[ 66%] No test step for 'libHexEx-download'
[ 77%] Completed 'libHexEx-download'
[100%] Built target libHexEx-download
Configuring HexEx inside another cmake project...
Not building with local OpenVolumeMesh
Execution: stderr.txt stdout.txt
It looks like an error in the quantization step. Could you enable QGP3D_ENABLE_LOGGING in CMake and try to run again? It would be helpful to see what the problem there is. From the current log file, your Gurobi license has likely expired. But let's see...
Strange: Gurobi error message: HostID mismatch (licensed to e1429a7c, hostid is e1429a7b)
The host ID is managed by Gurobi itself with the provided grbgetkey executable.
On my Gurobi account, my license is indeed registered for host ID e1429a7c.
There is no possible mix-up: my older license was for my previous PC, and its host ID is completely different (and Gurobi doesn't allow multiple PCs for the same license).
Do you have something else to check before, or should I get in touch with the Gurobi support?
You had a working license from the previous log file (5 days ago). I have no idea what happened later about the license. Yeah, maybe you should contact Gurobi support.
The host ids are nearly identical, so I guess there is some minute apparent change in your configuration that Gurobi detected, maybe the order of 2 network adapters. As you're using the academic licenses, I think you should just be able to generate a new one for the same computer with the new host id. Good luck :)
FYI here are useful links to the Gurobi help center:
Yesterday, at the office, I tried to re-link my Gurobi installation to the same license, and it worked (both license activation & AlgoHex). Today, at home, AlgoHex failed because Gurobi detected a HostID mismatch, so I generated another license. I think I need to switch licenses depending on the network. I just need to find a way to check the HostID without launching AlgoHex.
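One possible way to check this without launching AlgoHex is to compare the HOSTID entry of the license file with the host ID the machine currently reports. The sketch below rests on two assumptions that should be verified against your installation: that the license file is plain KEY=VALUE text containing a HOSTID line, and that Gurobi's grbprobe tool is on the PATH and prints a HOSTID=... line. The helper names and the script name are hypothetical.

```python
import subprocess
import sys
from typing import Dict, Optional


def parse_key_values(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        result[key.strip()] = value.strip()
    return result


def hostid_matches(license_text: str, probe_text: str) -> Optional[bool]:
    """Return True/False if both HOSTIDs are present, None otherwise."""
    licensed = parse_key_values(license_text).get("HOSTID")
    current = parse_key_values(probe_text).get("HOSTID")
    if licensed is None or current is None:
        return None
    return licensed.lower() == current.lower()


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_hostid.py /path/to/gurobi.lic
    with open(sys.argv[1]) as f:
        license_text = f.read()
    # Assumption: grbprobe is on PATH and prints a HOSTID=... line.
    probe_text = subprocess.run(
        ["grbprobe"], capture_output=True, text=True, check=True
    ).stdout
    print("HostID matches license:", hostid_matches(license_text, probe_text))
```

Running this once per network (office, home) would show which license file matches the current host ID before starting a long AlgoHex run.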
Thank you for your help and the quick bug fixes!