hallfjonas / LCQPow

LCQPow - A Solver for Quadratic Programs with Linear Complementarity Constraints
GNU Lesser General Public License v2.1

Issue: `QPOASES_SCHUR` option basically does not work. #8

Closed hwyao closed 5 months ago

hwyao commented 7 months ago

Problem description

When the QPOASES_SCHUR option is enabled, linking always fails.

Ways to reproduce

We can just run cmake .. -DQPOASES_SCHUR=ON in the build folder of this repo, or set up a simple environment that builds the repo as an ExternalProject:

ExternalProject_Add(
   # ...
    CMAKE_ARGS 
      -DQPOASES_SCHUR=ON
)

Comparison

Expected: Linking against the MATLAB shared objects succeeds, which can be confirmed with ldd.

Actual: It always errors, even when Matlab is installed on the machine:

CMake Error at CMakeLists.txt:151 (message):
  Failed to locate one of the following dependencies for Schur Complement
  Method:

  MA57 /libmwma57.so

  BLAS /libmwblas.so

  LAPACK /libmwlapack.so

  METIS /libmwmetis.so

Reason

The reason is actually simple. ${MATLAB_LIBDIR} is used at CMakeLists.txt#L103, but the corresponding find_package only runs at CMakeLists.txt#L344. The variable is therefore still empty at the point of use, which is why the paths in the error message above begin with a bare /.

hwyao commented 7 months ago

Fixing this problem needs more attention.

Comment 1: Original design?

I am not sure about the original design intent of this repo, but I tried moving:

find_package(Matlab)
get_filename_component(
    MATLAB_LIBDIR
    ${Matlab_MEX_LIBRARY}
    DIRECTORY
)

above the first if (${QPOASES_SCHUR}) in CMakeLists.txt. With a few extra minor fixes this more or less works (at least the build passes), but examples such as OptimizeOnCircle never work (with QPOASES_SCHUR enabled and QPSolver::QPOASES_SPARSE in use):

Preparing unit circle optimization problem...

ERROR: The subproblem solver produced an error.
Failed to solve LCQP.

My further (quick) debugging did not turn up a quick fix. Maybe I misunderstood some part of the code design?

Here are some of my observations:

  1. Reading SubsolverQPOASES.cpp#L59: enabling QPOASES_SCHUR sets the flag SOLVER_MA57, which lets the solver construct qpSchur = qpOASES::SQProblemSchur(nV, nC);, which in turn calls into the qpOASES repo. I already checked that qpSchur = qpOASES::SQProblemSchur(nV, nC); is executed correctly, so this part is fine.
  2. CMakeLists.txt#L288 links ${Matlab_LIBRARIES} to ${PROJECT_NAME}-shared. I don't think this can work. Given point 1, the LCQPow code itself (in the src folder) only uses the external SQProblemSchur and makes no direct calls to Matlab functions. So does this linking do anything?
  3. By analogy with points 1 and 2, I would expect qpOASES (sorry, I didn't have time to read it carefully) to use the Matlab functions when SOLVER_MA57 is enabled. But according to ldd, neither liblcqpow.so nor libqpOASES.so is linked against the Matlab objects. So either I misunderstand something here, or this is exactly where the problem comes from (again a linking issue, because the Makefile is used?)

hwyao commented 7 months ago

Comment 2: Improved static library?

Although both a static and a shared library are provided, the static library still requires linking the old way, because only the shared versions of qpoases and osqp are copied.

        target_link_libraries(
            ${EXAMPLE_NAME}

            PUBLIC ${PROJECT_NAME}-shared

            #PUBLIC ${PROJECT_NAME}-static
            #PRIVATE qpoases_lib osqp_lib
        )

Copying the static versions as well and linking them in advance would (partly) solve this. But if Matlab functions are somehow used in liblcqpow.so (see Comment 1), can the static version ever be used out of the box?

hallfjonas commented 7 months ago

Thank you for posting this issue. I am working on resolving this now. Some comments:

hwyao commented 7 months ago

Some comments on 425a78f:

After rolling back to 425a78f, running OptimizeOnCircle on my machine gives:

./bin/examples/OptimizeOnCircle: error while loading shared libraries: libmwmetis.so: cannot open shared object file: No such file or directory

The good news is that in this version ldd libqpOASES.so shows the expected result: the library is linked against the Matlab objects.

> ldd libqpOASES.so 
        linux-vdso.so.1 (0x00007ffd91edc000)
        /usr/local/MATLAB/R2023b/bin/glnxa64/libmwlapack.so (0x00007f2e41e67000)
        /usr/local/MATLAB/R2023b/bin/glnxa64/libmwblas.so (0x00007f2e41e48000)
        /usr/local/MATLAB/R2023b/bin/glnxa64/libmwma57.so (0x00007f2e41e18000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f2e41bfc000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f2e41aad000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f2e41a92000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2e4189e000)
        libut.so => /usr/local/MATLAB/R2023b/bin/glnxa64/libut.so (0x00007f2e417f3000)
        libmwbinder.so => /usr/local/MATLAB/R2023b/bin/glnxa64/libmwbinder.so (0x00007f2e417d1000)
        libmwompwrapper.so => /usr/local/MATLAB/R2023b/bin/glnxa64/libmwompwrapper.so (0x00007f2e417c6000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f2e417a3000)
        libmwmetis.so => not found
        libgfortran.so.5 => /lib/x86_64-linux-gnu/libgfortran.so.5 (0x00007f2e414d9000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f2e420c6000)
        ... (and many lines)

Debug

It seems that libmwmetis.so is not found. It is an option at CMakeLists.txt#L169 but is never used in the qpOASES_MAKE_ARGS links below it.

The dependency chain is libqpOASES.so -> libmwma57.so -> libmwmetis.so, and libmwma57.so is missing the corresponding rpath:

> ldd /usr/local/MATLAB/R2023b/bin/glnxa64/libmwma57.so
        linux-vdso.so.1 (0x00007ffcc555c000)
        libmwblas.so => not found
        libmwmetis.so => not found
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f3b69998000)
        libgfortran.so.5 => /lib/x86_64-linux-gnu/libgfortran.so.5 (0x00007f3b696d0000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3b69581000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3b6938f000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f3b69a27000)
        libquadmath.so.0 => /lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007f3b69343000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f3b69328000)

> readelf -d /usr/local/MATLAB/R2023b/bin/glnxa64/libmwma57.so | head -20

Dynamic section at offset 0x2ddc0 contains 30 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libmwblas.so]
 0x0000000000000001 (NEEDED)             Shared library: [libmwmetis.so]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libgfortran.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x2000
 0x000000000000000d (FINI)               0x27e60
 0x0000000000000019 (INIT_ARRAY)         0x2edb0
 0x000000000000001b (INIT_ARRAYSZ)       8 (bytes)
 0x000000000000001a (FINI_ARRAY)         0x2edb8
 0x000000000000001c (FINI_ARRAYSZ)       8 (bytes)
 0x0000000000000004 (HASH)               0x260
 0x000000006ffffef5 (GNU_HASH)           0x4c0
 0x0000000000000005 (STRTAB)             0xe58
 0x0000000000000006 (SYMTAB)             0x690
 0x000000000000000a (STRSZ)              950 (bytes)
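
The two checks above (unresolved entries in ldd, no rpath entry in readelf -d) can be scripted. This is only a sketch and not part of LCQPow: missing_deps and has_runpath are hypothetical helper names, and both read the tool output from standard input so they also work on a saved log.

```shell
# List dependencies that the dynamic loader could not resolve.
# Input: the output of `ldd <library>` on stdin.
missing_deps() {
    awk '/=> not found/ { print $1 }'
}

# Succeed (exit 0) if the library declares an RPATH or RUNPATH entry.
# Input: the output of `readelf -d <library>` on stdin.
has_runpath() {
    grep -Eq '\((RPATH|RUNPATH)\)'
}

# Typical use (paths as in the comment above):
#   ldd libqpOASES.so | missing_deps
#   readelf -d /usr/local/MATLAB/R2023b/bin/glnxa64/libmwma57.so | has_runpath \
#       || echo "no rpath/runpath entry"
```

On the output quoted above, the first command would be expected to print libmwmetis.so.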

Anyway, running export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/MATLAB/R2023b/bin/glnxa64/ works around the problem for now, and I can run the example. (Is this some kind of Matlab design?)

hwyao commented 7 months ago

Interesting. My bad. 😟 I think I made a mistake yesterday: I forgot to clear the CMake cache after adding these lines to CMakeLists.txt:

find_package(Matlab)
get_filename_component(
    MATLAB_LIBDIR
    ${Matlab_MEX_LIBRARY}
    DIRECTORY
)

As a result, something went wrong in qpOASES's build process.

After deleting the build folder and rebuilding, ldd libqpOASES.so shows correct linking against the Matlab objects, as in my last comment. So it now works the same as 425a78f. (Cloning the repo into another folder at a specific commit also forces a full rebuild, which is how the problem got resolved by accident.)
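
For anyone hitting the same stale-cache problem, a clean reconfigure can be sketched as follows. fresh_build_dir is a hypothetical helper, not something the repo provides, and it assumes the build directory is called build inside the source tree.

```shell
# Recreate an empty build directory so that no stale CMakeCache.txt
# (or half-built qpOASES artifacts) survives from an earlier configure.
# $1: path to the source tree (hypothetical argument, defaults to ".").
fresh_build_dir() {
    src_dir=${1:-.}
    rm -rf "${src_dir}/build"
    mkdir -p "${src_dir}/build"
}

# Typical use from the repository root:
#   fresh_build_dir . && cd build && cmake .. -DQPOASES_SCHUR=ON && cmake --build .
```

Removing the whole directory is blunt, but it avoids guessing which cached entries or qpOASES build products are out of date.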

The only remaining problem is the missing rpath in libmwma57.so, which I think is outside our control (since the library is provided by Matlab).

(Do you already set LD_LIBRARY_PATH somewhere in your .bashrc, so that you don't run into this problem? :) )

hallfjonas commented 7 months ago

Ok great.

About qpOASES build process & caching issue

As far as I can remember, qpOASES originally only provided a plain Makefile to build the project. Later on, a CMake option was integrated, but it does not cover the full functionality, such as linking MA57. That's why I am building qpOASES with the make option. This could be the reason you had to remove the entire build directory. I'm not sure how to fix this.

Linking issue

I don't export the Matlab directory, only the LCQPow/build/lib directory. The CMake process automatically creates symbolic links to the required Matlab libraries in that directory. If I remember correctly, this was my way of avoiding having to export the Matlab directory.
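
A manual equivalent of that symlink step might look like the sketch below. It is an illustration only, not the logic in CMakeLists.txt: link_matlab_libs is a hypothetical helper, and the library list is taken from the CMake error message earlier in this thread, so the set actually linked by the build may differ.

```shell
# Link the Matlab libraries needed by the Schur complement method into a
# directory that is already on the loader search path.
# $1: Matlab library directory (use an absolute path, otherwise the
#     links dangle), $2: destination directory (e.g. build/lib).
link_matlab_libs() {
    matlab_libdir=$1
    dest=$2
    mkdir -p "$dest"
    for lib in libmwma57.so libmwblas.so libmwlapack.so libmwmetis.so; do
        if [ -e "${matlab_libdir}/${lib}" ]; then
            ln -sf "${matlab_libdir}/${lib}" "${dest}/${lib}"
        else
            echo "warning: ${lib} not found in ${matlab_libdir}" >&2
        fi
    done
}

# Typical use:
#   link_matlab_libs /usr/local/MATLAB/R2023b/bin/glnxa64 "$PWD/build/lib"
```

With the links in place, only build/lib has to be on LD_LIBRARY_PATH, and the loader can find libmwmetis.so there when resolving libmwma57.so.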

hwyao commented 7 months ago

Caching issue

Well, I think it is difficult to fix, for the reason you gave. But "normal" users should not need to touch this part, so this is fine anyway.

Linking issue

Hmm, now I get the reason for running export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<LCQPow-dir>/build/lib.

Interesting. I really don't see a straightforward alternative. The README could be updated to say that this command is needed whenever you want to use sparse qpOASES (instead of, or not only, when using the Matlab interface).
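
If this ends up in the README, a variant that is safe to put in .bashrc could be sketched like this. append_ld_path is a hypothetical helper name; the point is only that sourcing the file repeatedly does not keep growing the variable.

```shell
# Append a directory to LD_LIBRARY_PATH only if it is not already listed.
# $1: directory to add.
append_ld_path() {
    dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":${dir}:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${dir}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Typical use (replace the placeholder with your own checkout):
#   append_ld_path "/path/to/LCQPow/build/lib"
```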