ferrandi / PandA-bambu

PandA-bambu public repository
GNU General Public License v3.0

Build error on arch linux. libsuitesparseconfig not found #29

Closed TheZoq2 closed 3 years ago

TheZoq2 commented 3 years ago

Hi!

I'm trying to build the tool from git on arch linux by following the instructions in the INSTALL file.

I skipped step 1 as I don't intend to install into /opt, and don't want to change the owner of that system dir to my local user.

All dependencies are installed as per the ArchLinux package list, however

../configure --prefix=$HOME/panda --enable-verilator --enable-glpk --enable-opt --enable-flopoco --with-opt-level=fast

fails with

checking for SuiteSparse_malloc in -lsuitesparseconfig... no
Checking for libsuitesparseconfig... no
configure: error: "libsuitesparseconfig not found"

The full configure output is here https://gist.github.com/TheZoq2/2995aee7ea2b35a5729b1528fea6ac9e

The suitesparse package is installed, and the relevant library files are in /usr/lib:

/usr/lib/libsuitesparseconfig.so
/usr/lib/libsuitesparseconfig.so.5.8.1
/usr/lib/libsuitesparseconfig.so.5

GCC can also link against the library when building an empty main file:

> gcc main.c -lsuitesparseconfig

runs without error

Am I missing something here? I'm not super familiar with autoconf.

fabrizioferrandi commented 3 years ago

Hi! The failing macro is the one that checks for SuiteSparse_malloc in the suitesparseconfig library. Does the main.c you tested call SuiteSparse_malloc?

One way to understand what is going on in your distribution is to reproduce the issue in a VM. Under etc/VMs/PandA-bambu-VM_64bit-ArchLinux/ there is a Vagrant script that we used to test past PandA/bambu versions on ArchLinux. It is a bit outdated, but it should be a good starting point for reconstructing your exact setup.

Finally, I would suggest starting from the panda-0.9.7-dev branch, since it contains many fixes to the source code.

Thanks, Fabrizio

TheZoq2 commented 3 years ago

Thanks for the quick reply!

It looks like calling SuiteSparse_malloc does work: it gives an implicit declaration warning, as expected, but compiles and runs fine:

/tmp ♦ ➔ cat main.c
int main() {
    SuiteSparse_malloc();
}
/tmp ♦ ➔ gcc main.c -lsuitesparseconfig
main.c: In function ‘main’:
main.c:2:5: warning: implicit declaration of function ‘SuiteSparse_malloc’ [-Wimplicit-function-declaration]
    2 |     SuiteSparse_malloc();
      |     ^~~~~~~~~~~~~~~~~~
/tmp ♦ ➔ ./a.out

Thanks for the tip on using the panda-0.9.7 branch. I tried both it and main with the same result, but I'll keep using the dev branch then :) I'll also look into the VM.

fabrizioferrandi commented 3 years ago

Another thing you could look into is the config.log that configure creates in the build directory. There you should find the error that prevents the check from passing. Cheers, Fabrizio

TheZoq2 commented 3 years ago

Ah right, looks like my test for SuiteSparse_malloc was flawed. The C compiler finds it, but the C++ compiler does not.

configure:32326: g++ -o conftest  --std=c++17  -DNDEBUG    conftest.cpp -lsuitesparseconfig  -lz -lm  >&5
/usr/bin/ld: /tmp/cccNxvYy.o: in function `main':
conftest.cpp:(.text+0x5): undefined reference to `SuiteSparse_malloc()'

Compiling the test code that configure seems to use produces the same result:

/* Override any GCC internal prototype to avoid an error.
   Use char because int might match the return type of a GCC
   builtin and then its argument prototype would still apply.  */
char SuiteSparse_malloc ();
#ifdef F77_DUMMY_MAIN

#  ifdef __cplusplus
     extern "C"
#  endif
   int F77_DUMMY_MAIN() { return 1; }

#endif
int
main (void)
{
return SuiteSparse_malloc ();
  ;
  return 0;
}

I guess I'll have to look into why that function isn't in this lib
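A note on the linker message above: the parentheses in `SuiteSparse_malloc()' suggest that ld is reporting a demangled C++ name, meaning the object file asks for a symbol with C++ linkage rather than the plain C symbol a C library would export. Below is a minimal sketch for checking such names by hand. It assumes g++ or clang++ with the Itanium C++ ABI, where <cxxabi.h> provides abi::__cxa_demangle; demangle_symbol is a hypothetical helper name, not part of bambu or autoconf.

```cpp
#include <cassert>
#include <cstdlib>   // std::free
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <string>

// Hypothetical helper: return the human-readable form of an Itanium-ABI
// mangled symbol, or the input unchanged when it is not a mangled C++ name.
std::string demangle_symbol(const std::string& symbol) {
    assert(!symbol.empty() && "symbol name must not be empty");

    // Mangled C++ names start with "_Z"; plain C symbols do not.
    if (symbol.rfind("_Z", 0) != 0) {
        return symbol;
    }

    int status = 0;
    // With null buffer arguments, the result is allocated with malloc.
    char* raw = abi::__cxa_demangle(symbol.c_str(), nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr) {
        return symbol;  // not a valid mangled name; leave it as-is
    }

    std::string readable(raw);
    std::free(raw);
    return readable;
}
```

For example, demangle_symbol("_Z18SuiteSparse_mallocv") returns "SuiteSparse_malloc()", the form shown in the error, while the bare name "SuiteSparse_malloc" is returned unchanged.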

TheZoq2 commented 3 years ago

Looking into this, it seems like some things changed in suitesparse 4.3 (https://savannah.gnu.org/bugs/?43063), and there is at least a mention of SuiteSparse_malloc in those comments.

What suitesparse version is this built for?

fabrizioferrandi commented 3 years ago

Hi, libsuitesparse is required by GLPK. We use GLPK as a MILP solver for the SDC scheduling. So for bambu what matters is GLPK, not suitesparse.

fabrizioferrandi commented 3 years ago

One more thing. To check for the presence of suitesparse, configure runs the following test:

      AC_CHECK_LIB([suitesparseconfig],[SuiteSparse_malloc], [echo "Checking for libsuitesparseconfig... yes"], [echo "Checking for libsuitesparseconfig... no"; AC_MSG_ERROR("libsuitesparseconfig not found")])

If for some reason SuiteSparse_malloc has been dropped, we could change the check by replacing SuiteSparse_malloc with, for example, SuiteSparse_version.

fabrizioferrandi commented 3 years ago

One more suggestion: PandA/bambu can also be compiled with Clang. You just need to pass CXX=clang++ CC=clang to configure. Maybe this helps to work around the issue.

TheZoq2 commented 3 years ago

Thanks for the advice. I'm finally back to trying this again.

Unfortunately, clang does not seem to make any difference.

I also looked into which symbols are actually in the libsuitesparseconfig file on my system. It turns out that SuiteSparse_malloc is there. After some debugging, I came across https://stackoverflow.com/questions/40431563/finding-the-root-cause-of-undefined-reference-error which gave me a hint that something is wrong with C/C++ interop.

Sure enough, the conftest.cpp file which looks like this:

/* end confdefs.h.  */

/* Override any GCC internal prototype to avoid an error.
   Use char because int might match the return type of a GCC
   builtin and then its argument prototype would still apply.  */
char SuiteSparse_malloc ();
#ifdef F77_DUMMY_MAIN

#  ifdef __cplusplus
     extern "C"
#  endif
   int F77_DUMMY_MAIN() { return 1; }

#endif
int
main (void)
{
return SuiteSparse_malloc ();
  ;
  return 0;
}

declares a C++ symbol called SuiteSparse_malloc, while the symbol in the library seems to be a C symbol. Adding extern "C" to the declaration makes that code build, which also explains why the test code was buildable when built as C on Friday.

Edit: This seems to be a consistent problem throughout: I get similar errors for -lamd, -lcolamd, -lltdl, and -lglpk as I add extern "C" to each check in the configure file in turn.
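To make the mismatch concrete, here is a minimal sketch of how the two linkages name the same function. It assumes the Itanium C++ ABI used by g++ and clang++ on Linux and covers only the simplest case, a global function with no parameters. Both helper names are made up for illustration.

```cpp
#include <cassert>
#include <string>

// Illustrative only: the simplest Itanium C++ ABI encoding, for a function in
// the global namespace that takes no arguments:
//   "_Z" + <length of name> + <name> + "v"   ("v" = empty parameter list)
// The return type is not part of the mangled name for ordinary functions.
std::string mangle_nullary_function(const std::string& name) {
    assert(!name.empty() && "function name must not be empty");
    return "_Z" + std::to_string(name.size()) + name + "v";
}

// With C linkage (a C compiler, or extern "C" in C++), the ELF symbol is
// simply the identifier itself.
std::string c_linkage_symbol(const std::string& name) {
    assert(!name.empty() && "function name must not be empty");
    return name;
}
```

Here mangle_nullary_function("SuiteSparse_malloc") yields _Z18SuiteSparse_mallocv, which is what the C++ conftest asks the linker for, whereas a C library exports plain SuiteSparse_malloc. Declaring the function as extern "C" char SuiteSparse_malloc (); makes the C++ compiler request the unmangled name instead.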

fabrizioferrandi commented 3 years ago

OK, the issue is that the test is done using g++. The solution should be to wrap the AC_CHECK_LIB call in AC_LANG_PUSH and AC_LANG_POP:

      AC_LANG_PUSH([C])
      AC_CHECK_LIB([suitesparseconfig],[SuiteSparse_malloc], [echo "Checking for libsuitesparseconfig... yes"], [echo "Checking for libsuitesparseconfig... no"; AC_MSG_ERROR("libsuitesparseconfig not found")])
      AC_LANG_POP([C])

Could you please check whether this works?

TheZoq2 commented 3 years ago

Yep, that seems to have done it, though I wrapped the whole GLPK test block in it because I needed extern "C" for everything in there.

Thanks for the help!

Also, a side note: is it normal for the configure script to run make and make install? I'm used to doing ./configure && make && sudo make install to install globally, but that's not possible here.

fabrizioferrandi commented 3 years ago

> Yep, that seems to have done it, though I wrapped the whole GLPK test block in it because I needed extern "C" for everything in there.
>
> Thanks for the help!

I'll fix the issue in the branch and later in the main repository.

> Also, a side note: is it normal for the configure script to run make and make install? I'm used to doing ./configure && make && sudo make install to install globally, but that's not possible here.

Not sure which script you are referring to; sudo make install is possible. I tend to put the bambu stuff under a user directory, but all the "standard" flavors are supported.

TheZoq2 commented 3 years ago

> Not sure which script you are referring to

Running ../configure without specifying a prefix seems to call make install at some point, which results in errors when it tries to write to /usr. Ideally, I'd like configure and make not to require root privileges, with root only being required for make install.

fabrizioferrandi commented 3 years ago

You are right. make install is run when --enable-flopoco is passed to configure. In that case, sollya and libfplll-4.0.3 have to be already installed before flopoco can be configured. If you are not enabling flopoco, make install is not called during configure.

TheZoq2 commented 3 years ago

Ah, makes sense. Thanks!

fabrizioferrandi commented 3 years ago

Hi @TheZoq2

Revision rev in branch panda-0.9.7-dev should fix the issue. Cheers, Fabrizio