cgg-bern / AlgoHex

GNU Affero General Public License v3.0

Error when calling CoMiSo. #17

Open Densce opened 7 months ago

Densce commented 7 months ago

Hello, I'm new to using your code. I have successfully compiled this project on an Ubuntu system. However, when I ran the example you provided, I got the following error:

constraints = 11646

independent constraints = 3249

exploit detected special properties: constant jacobian of equality constraints constant jacobian of in-equality constraints
[deng-virtual-machine:06260] Process received signal
[deng-virtual-machine:06260] Signal: Segmentation fault (11)
[deng-virtual-machine:06260] Signal code: Address not mapped (1)
[deng-virtual-machine:06260] Failing at address: (nil)
[deng-virtual-machine:06260] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f019e7ac420]
[deng-virtual-machine:06260] [ 1] /lib/libipopt.so.1(_ZN5Ipopt16IpoptApplication12OptimizeTNLPERKNS_8SmartPtrINS_4TNLPEEE+0x20) [0x7f019e1834d0]
[deng-virtual-machine:06260] [ 2] /home/deng/Desktop/3Dblocks/AlgoHex/AlgoHex/build/Build/lib/libCoMISo.so(_ZN6COMISO11IPOPTSolver5solveEPNS_17NProblemInterfaceERKSt6vectorIPNS_20NConstraintInterfaceESaIS5_EE+0xe7c) [0x7f019efdf7ec]
[deng-virtual-machine:06260] [ 3] ./../build/Build/bin/HexMeshing(+0x2af386) [0x562ead919386]
[deng-virtual-machine:06260] [ 4] ./../build/Build/bin/HexMeshing(+0x35f853) [0x562ead9c9853]
[deng-virtual-machine:06260] [ 5] ./../build/Build/bin/HexMeshing(+0x360843) [0x562ead9ca843]
[deng-virtual-machine:06260] [ 6] ./../build/Build/bin/HexMeshing(+0xad90c) [0x562ead71790c]
[deng-virtual-machine:06260] [ 7] ./../build/Build/bin/HexMeshing(+0x9c2c0) [0x562ead7062c0]
[deng-virtual-machine:06260] [ 8] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f019e389083]
[deng-virtual-machine:06260] [ 9] ./../build/Build/bin/HexMeshing(+0x9cc6e) [0x562ead706c6e]
[deng-virtual-machine:06260] End of error message
Segmentation fault (core dumped)

I've traced the failing statement to line 1066 of "src\AlgoHex\Parametrization3DT_impl.hh": "ipsol.solve(&fe_problem_timings, constraint_pointers);". However, I don't know how to fix it, so I'm asking for your help.

Best wishes, Deng

mheistermann commented 7 months ago

Hello! Could you give some details on the setup of your VM, i.e. Linux distribution and version, as well as library versions / how you installed them? Then we can try to reproduce this issue.

Also can you check if your VM may be running out of memory? AlgoHex can use a lot of memory, and there's no specific error checking for failed allocations. (It does seem unlikely to me that this would cause a nullptr deref on a standard Linux system, but I could imagine it happening if the allocation was attempted by one of the third-party libraries using some custom allocator).
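For the memory check, here is a minimal sketch (Python, Linux assumed) that runs a command and reports the peak resident memory of the child process. The helper name `run_and_report_peak_rss` is made up for illustration and is not part of AlgoHex.

```python
import resource
import subprocess
import sys


def run_and_report_peak_rss(cmd):
    """Run cmd and return (exit_code, peak_rss_mib).

    The peak is the largest resident set size among child processes that
    this Python process has waited for so far.
    """
    proc = subprocess.run(cmd)
    # Assumption: Linux, where ru_maxrss is reported in kilobytes.
    peak_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return proc.returncode, peak_kib / 1024.0


if __name__ == "__main__" and len(sys.argv) > 1:
    code, peak = run_and_report_peak_rss(sys.argv[1:])
    # A negative exit code means the child was killed by a signal
    # (for example -11 for SIGSEGV).
    print(f"exit code: {code}, peak RSS: {peak:.1f} MiB")
```

If this is saved as, say, `peak_rss.py` (a hypothetical file name), it can be invoked with the HexMeshing command line as its arguments. A peak far below the VM's RAM would suggest that memory exhaustion is not the cause.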

Densce commented 7 months ago

Hello,

Thank you for your reply! I am currently running Ubuntu 20.04.4 installed on VMware 16.0 Pro. The virtual machine is allocated 64GB of RAM and 32 cores, with a 100GB hard drive. When installing AlgoHex, I cloned the main branch from GitHub. The libraries I installed manually are GMP-6.3.0 and IPOPT-3.11.9-2.2build2, while the rest were automatically downloaded by cmake.

After running make, I tested the example you provided with the following command: ./...src/HexMeshing -i cylinder.ovm -o cylinder_out.ovm. Subsequently, I encountered the error mentioned previously. I'm unsure if it's related to the version of IPOPT. I've attached some relevant log files here for your review.

cmake_output.txt make_output.txt runlog.txt

Additionally, I am considering setting up a new virtual Ubuntu system with more allocated memory to test further. Could you please let me know the amount of memory you are currently using for your tests?

mheistermann commented 7 months ago

For the cylinder example, 64GB is definitely plenty, so it'll be something else. Unfortunately, I'm not sure what the maximum memory consumption on more challenging models was; our servers are pretty beefy (2TB).

For finding the cause of this issue, a debug build of AlgoHex and/or Ipopt would probably be helpful, to find out what exactly that null pointer is.
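As a rough sketch of how the backtrace could be mapped back to source lines once a debug build exists: the frames printed as `module(+0xOFFSET)` carry an offset relative to the module, which `addr2line` from binutils can resolve if the binary contains debug info. The small Python helper below (its name and regex are my own, not part of AlgoHex or Ipopt) just extracts those frames and prints the corresponding commands. Frames given as `symbol+0xOFFSET` are skipped because that offset is relative to the symbol, not the module.

```python
import re

# Matches frames such as "[ 3] ./bin/HexMeshing(+0x2af386) [0x562e...]".
# The symbol group is empty when the frame only has a module-relative offset.
FRAME_RE = re.compile(
    r"\[\s*\d+\]\s+(?P<module>\S+?)"
    r"\((?P<symbol>[^()+]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
)


def addr2line_commands(backtrace):
    """Return one addr2line command per module-relative frame in the trace."""
    commands = []
    for match in FRAME_RE.finditer(backtrace):
        if match.group("symbol"):
            # Offset is relative to a named symbol; the name is already known.
            continue
        # -f prints function names, -C demangles C++ names, -e selects the file.
        commands.append(
            f"addr2line -f -C -e {match.group('module')} {match.group('offset')}"
        )
    return commands


if __name__ == "__main__":
    trace = (
        "[deng-virtual-machine:06260] [ 3] "
        "./../build/Build/bin/HexMeshing(+0x2af386) [0x562ead919386]"
    )
    for command in addr2line_commands(trace):
        print(command)
```

Without debug info, `addr2line` will typically print `??` instead of a file and line, so this is only useful after rebuilding with debug symbols.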

Note that your version of ipopt is very old (that Debian/Ubuntu package appears abandoned); however we have been running AlgoHex with that old Debian version before, so in principle it could work.

You could still give a recent Ipopt version a try, e.g. installed via coinbrew, as done in the Dockerfile on this branch: https://github.com/cgg-bern/AlgoHex/blob/dev/dockerfile-without-gurobi/Dockerfile (If you use Gurobi, you won't need bonmin)

Densce commented 7 months ago

@mheistermann Thank you very much!

It appears that the issue may not be related to memory constraints or the version of IPOPT. It's possible that there could be conflicts between AlgoHex's dependencies and other libraries already installed on the system.

Setting that aside for now, I am currently trying to run the dockerfile-without-gurobi version in Docker, and I hope it runs smoothly! If I identify the cause, I will report back promptly.

Thank you once again for your assistance!
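One way to probe the library-conflict hypothesis above would be to check which shared libraries the binary actually resolves at load time. The sketch below (Python, Linux assumed) parses `ldd` output and keeps only entries whose names contain a few substrings of interest. The function names and the default substring list are assumptions for illustration, and the sample line in the usage part is hypothetical apart from the `/lib/libipopt.so.1` path that appears in the backtrace.

```python
import re
import subprocess

# Typical ldd line: "\tlibfoo.so.1 => /path/libfoo.so.1 (0x00007f...)"
# or "\tlibfoo.so.1 => not found".
LDD_RE = re.compile(r"^\s*(\S+)\s+=>\s+(.*?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$")


def parse_ldd(output, needles=("ipopt", "comiso", "gmp")):
    """Map library name -> resolved path for names containing any needle."""
    resolved = {}
    for line in output.splitlines():
        match = LDD_RE.match(line)
        if not match:
            continue
        name, target = match.group(1), match.group(2)
        if any(needle in name.lower() for needle in needles):
            resolved[name] = target
    return resolved


def resolved_libraries(binary):
    """Run ldd on the given binary and filter its output (Linux only)."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    )
    return parse_ldd(result.stdout)


if __name__ == "__main__":
    # Hypothetical sample output, used here instead of a real binary.
    sample = "\tlibipopt.so.1 => /lib/libipopt.so.1 (0x00007f019e100000)\n"
    print(parse_ldd(sample))
```

If the path reported for Ipopt (or another dependency) is not the copy the build was configured against, that would be a concrete lead for the conflict theory.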