golems / amino

Lightweight robotics utility library
http://amino.golems.org
BSD 3-Clause "New" or "Revised" License

Fortran error in compilation #22

Status: closed by wbthomason 6 years ago

wbthomason commented 6 years ago

amino fails to compile for me because of (I think) Fortran version errors.

I suspect that these errors are version related because I'm on Arch Linux, which tends to have newer versions than Xenial (the Travis environment). I've skimmed the Travis logs, and it seems you're using GCC 4.8.4 there. I would be a bit surprised if enough changed in gfortran between 4.8.4 and 5.4.1 to cause compilation to break, but I know very little about Fortran, so perhaps this is the case.

Do you have any suggestions for debugging my build? I have tried using ./configure --without-fortran, but this causes other errors - gcc: error: euler.c: No such file or directory.

My versions of each of the required dependencies are:

wbthomason commented 6 years ago

As an update, I compiled gcc 4.8.4 and tried using gfortran 4.8.4 in the same way as 5.4.1, etc. above. I get the same errors. This implies to me that there's some other configuration error with my system, but given that la_implf.f90 isn't dynamically generated by any part of the build (as far as I can tell), I'm not sure where to look.

ndantam commented 6 years ago

Can you check what flags are being passed to gfortran? You may need to run with make V=1.

It sounds like the flag to allow arbitrary line lengths may have changed. But, if you reverted to gcc/gfortran 4.8.4, then it should work OK... Did you rerun ./configure after switching the gfortran version?

Alternatively, we can skip gfortran and use f2c. Using f2c as a fortran90 compiler (FC) will probably fail, though, since I think f2c only does fortran77. If you don't set the FC or F77 variables, and run ./configure --without-fortran, then the build scripts will omit the fortran90 sources and use f2c only for the fortran77 sources.

And it sounds like I ought to add Arch Linux to the integration tests.

wbthomason commented 6 years ago

> Can you check what flags are being passed to gfortran? You may need to run with make V=1.

With make V=1:

/bin/sh ./libtool  --tag=FC   --mode=compile gfortran-4.8  -g -O2 -c -o la_mod.lo la_mod.f90
libtool: compile:  gfortran-4.8 -g -O2 -c la_mod.f90  -fPIC -o .libs/la_mod.o

Referencing the relevant line of the latest Travis build

/bin/bash ./libtool  --tag=FC   --mode=compile gfortran  -Wintrinsic-shadow -Wunused-parameter -Wtabs -Warray-temporaries -Wunderflow -Wimplicit-procedure -Wimplicit-interface -Wshadow -Wextra -pedantic -Wall -ffree-line-length-none -fimplicit-none -g -O2 -c -o la_mod.lo la_mod.f90

it looks like I may need to manually add back in the flags if I'm manually setting FC and F77. I could also try symlinking gfortran-4.8 to gfortran on my system as a hack to test if it's just a matter of missing flags.
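One way to see exactly which flags are missing is to diff the two command lines. The following is a minimal Python sketch; the two strings are copied from the logs above (without the libtool wrapper), and the helper name dash_flags is made up for illustration.

```python
import shlex

# Compiler invocations copied from the two build logs in this thread.
LOCAL = "gfortran-4.8 -g -O2 -c -o la_mod.lo la_mod.f90"
TRAVIS = (
    "gfortran -Wintrinsic-shadow -Wunused-parameter -Wtabs "
    "-Warray-temporaries -Wunderflow -Wimplicit-procedure "
    "-Wimplicit-interface -Wshadow -Wextra -pedantic -Wall "
    "-ffree-line-length-none -fimplicit-none -g -O2 -c -o la_mod.lo la_mod.f90"
)


def dash_flags(command):
    """Return the set of dash-prefixed options, skipping -o and its argument."""
    result = set()
    skip_next = False
    for token in shlex.split(command)[1:]:  # [1:] drops the compiler name
        if skip_next:
            skip_next = False
        elif token == "-o":
            skip_next = True
        elif token.startswith("-"):
            result.add(token)
    return result


# Flags present on Travis but absent from the local build.
missing = sorted(dash_flags(TRAVIS) - dash_flags(LOCAL))
print("\n".join(missing))
```

The printed list includes -ffree-line-length-none, the flag ndantam suspected, along with -fimplicit-none and the warning options.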

> It sounds like the flag to allow arbitrary line lengths may have changed. But, if you reverted to gcc/gfortran 4.8.4, then it should work OK... Did you rerun ./configure after switching the gfortran version?

I did rerun ./configure. In fact, I've been taking the unnecessarily drastic step of wiping out the repo and re-cloning every time I switch versions, just to make sure the environment is totally clean: when I used make clean between version switches, some intermediate files seemed to stick around.

However, I may have misread the error output from gfortran 4.8.4 when I first ran it - it is subtly different from what I get with later versions, and does not mention the line extension. I've copied the full error output here: https://gist.github.com/wbthomason/6673d68c3c95da4f477b18c6b5801150

> Alternatively, we can skip gfortran and use f2c. Using f2c as a fortran90 compiler (FC) will probably fail, though, since I think f2c only does fortran77. If you don't set the FC or F77 variables, and run ./configure --without-fortran, then the build scripts will omit the fortran90 sources and use f2c only for the fortran77 sources.

I'd tried this before, unfortunately. I get the error:

libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I./include -I./include -I./src/mac -g -O2 -MT euler.lo -MD -MP -MF .deps/euler.Tpo -c euler.c  -fPIC -DPIC -o .libs/euler.o
gcc: error: euler.c: No such file or directory
gcc: fatal error: no input files
compilation terminated.
make[2]: *** [Makefile:2795: euler.lo] Error 1

so it looks like something isn't being generated when it needs to be.

> And it sounds like I ought to add Arch Linux to the integration tests.

I'd be happy to help with this if I can!

Sorry for all the trouble (my Fortran knowledge could fit in the empty set), and thank you for the help! I'm going to try manually specifying the missing feature flags and seeing if that works; I'll post the results back here.

ndantam commented 6 years ago

Do you have the autoconf macro archive installed?

The build scripts use some of those macros to check which compiler flags are valid. It looks like those checks aren't happening.
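For readers unfamiliar with these checks: macros in the autoconf archive such as AX_CHECK_COMPILE_FLAG test a flag by compiling a trivial program with it and keeping the flag only if the compiler succeeds. Whether amino uses that particular macro is an assumption. The Python sketch below imitates the idea in simplified form; the function name flag_is_accepted and the probe program are made up for illustration, and the real macros are more thorough.

```python
import os
import subprocess
import tempfile

# A trivial Fortran program used only as a compile probe (hypothetical).
PROBE_SOURCE = "program probe\nend program probe\n"


def flag_is_accepted(compiler, flag, source=PROBE_SOURCE, suffix=".f90"):
    """Return True if `compiler` (a list of argv words) compiles the probe with `flag`."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe" + suffix)
        obj = os.path.join(tmp, "probe.o")
        with open(src, "w") as handle:
            handle.write(source)
        command = list(compiler) + [flag, "-c", src, "-o", obj]
        try:
            done = subprocess.run(
                command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
            )
        except FileNotFoundError:
            # The compiler itself is not installed or not on the PATH.
            return False
        return done.returncode == 0
```

For example, flag_is_accepted(["gfortran"], "-ffree-line-length-none") reports whether the installed gfortran accepts that flag. If the checks never run, no optional flags are added and the compiler is invoked with only the defaults.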

wbthomason commented 6 years ago

Well, adding the flags did get me past this particular error, and onto new and exciting errors when compiling src/rx/mp/scene_ompl.cpp, caused by calls to missing OMPL functions. During the configure stage, I get warnings that ompl/base/TypedSpaceInformation.h is "present but cannot be compiled" (and a request to report the warning to you). I suspect the new errors are related to this warning.

wbthomason commented 6 years ago

I did not have the autoconf macro archive installed. Installing that does also fix the Fortran error, though not the OMPL error (as expected).

On Arch, autoconf-archive and autoconf are separate packages, and installing autoconf does not pull in autoconf-archive, which explains why the archive was missing.

ndantam commented 6 years ago

Ok, progress!

Can you attach the config.log with the TypedSpaceInformation.h error?

(FYI, this check is to work with older versions of OMPL that don't include some templates for managing configuration spaces)

wbthomason commented 6 years ago

Here's the config.log. The relevant lines are 3150 - 3169: config.log

Here also are the errors I get when compiling: ompl_errors.txt

ndantam commented 6 years ago

Could you please apply the attached patch and re-make? cloneState.patch.txt

wbthomason commented 6 years ago

That looks to have fixed some errors. I'm playing around with the ompl-compat files now to see if I can get the rest.

Here's the diff of the two error logs: error_diff.txt

Here's the full second log: ompl_errors2.txt

Out of curiosity, do you think this is because my OMPL is too old? I have v1.3.2 installed.

ndantam commented 6 years ago

Looks like newer GCC is stricter with templates. The patch fixed an error that older GCC had ignored.

The other errors might come from ompl's change from boost::shared_ptr to std::shared_ptr. What happens if you change every boost::shared_ptr to std::shared_ptr in ompl-compat/.../TypedSpaceInformation.h?
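The suggested substitution can be scripted instead of edited by hand. This is a minimal Python sketch; the function name use_std_shared_ptr and the sample typedef are hypothetical, not taken from amino. After the substitution the header also needs #include <memory> if it does not already have it.

```python
import re


def use_std_shared_ptr(source):
    """Replace every boost::shared_ptr with std::shared_ptr in C++ source text."""
    # \b keeps longer identifiers such as boost::shared_ptr_like untouched.
    return re.sub(r"\bboost::shared_ptr\b", "std::shared_ptr", source)


# Hypothetical line standing in for the contents of the header.
sample = "typedef boost::shared_ptr<TypedSpaceInformation> Ptr;\n"
print(use_std_shared_ptr(sample), end="")
```

To apply it to the real file, read the header into a string, pass it through use_std_shared_ptr, and write the result back.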

wbthomason commented 6 years ago

Interesting. That change does fix the OMPL errors. Now, however, SBCL complains about being unable to load LIBAMINO because it cannot open libamino.so while compiling tf-grovel.lisp.

I am using SBCL 1.4.6 vs the 1.3.1 on Travis. I'm more inclined to believe that this is a configuration error with my CL environment, though, so I'm going to play around with that and see if I can make things work.

Sorry about the cascading rabbit hole of errors here!

wbthomason commented 6 years ago

After a make clean && make, it appears I misread the earlier SBCL error. The failure to load LIBAMINO happens during evaluation of make-aarx.lisp, and is because libamino.so contains the undefined symbol aa_tf_eulerzxy2rotmat_. (I confirmed this after deleting the SBCL cache and rebuilding)

ndantam commented 6 years ago

Seems like something is not getting compiled or linked correctly.

The function aa_tf_eulerzxy2rotmat_ is in the generated source file src/mac/euler.f (fortran appends an underscore to function symbols). This file is either compiled with gfortran or translated with f2c and linked into libamino. Are we missing one of these steps?
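The underscore convention differs between the two toolchains, which may matter when both are installed. By default gfortran appends one underscore to external procedure names, while f2c (like g77) appends a second underscore when the name already contains one. The Python sketch below models just that rule; the function name external_symbol is made up for illustration, and whether this difference caused the failure in this build is not established here.

```python
def external_symbol(name, second_underscore=False):
    """Return the linker symbol for an external Fortran procedure.

    second_underscore=False models the gfortran default;
    second_underscore=True models the f2c/g77 default.
    """
    lowered = name.lower()  # Fortran names are case-insensitive
    if second_underscore and "_" in lowered:
        return lowered + "__"
    return lowered + "_"


print(external_symbol("aa_tf_eulerzxy2rotmat"))
print(external_symbol("aa_tf_eulerzxy2rotmat", second_underscore=True))
```

The first line printed is the symbol SBCL reported as undefined; the second is what a translation with the f2c convention would define instead.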

wbthomason commented 6 years ago

I think that particular error may have been caused by having both f2c and gfortran accessible on the PATH. Uninstalling f2c gets past that error, but breaks at a very similar error - now, the missing symbol is cblas_dnrm2, which is used from src/la.c. src/la.c does get compiled into src/.libs/la.o, which is linked into .libs/libamino.so.0.0.0 as it should be. None of these steps seem to report failure.

There are a fair number more undefined symbols in libamino.so, many of which are other CBLAS functions: undefined.txt. I'm using CBLAS 3.8.0 and OpenBLAS 0.2.20. I'm not sure what version you use on Travis, but perhaps one or both of these is too new and has changed the API?
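A list like this can be produced by running nm -D --undefined-only on the library and filtering the output. The Python sketch below parses such output; the SAMPLE text is hypothetical apart from the two symbol names mentioned in this thread, and undefined_symbols is a made-up helper name. Undefined symbols in a shared library are normal as long as some library it is linked against provides them.

```python
def undefined_symbols(nm_output):
    """Extract undefined symbol names from `nm -D` style output."""
    symbols = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined entries have no address: only the type letter "U" and the name.
        if len(fields) == 2 and fields[0] == "U":
            # Drop a symbol-version suffix such as @GLIBC_2.14 if present.
            symbols.append(fields[1].split("@")[0])
    return symbols


# Hypothetical nm output; only the first two names come from this thread.
SAMPLE = """\
                 U cblas_dnrm2
                 U aa_tf_eulerzxy2rotmat_
0000000000012340 T aa_la_norm
                 U memcpy@GLIBC_2.14
"""

cblas_only = [s for s in undefined_symbols(SAMPLE) if s.startswith("cblas_")]
print(cblas_only)
```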

I've included the full output of make V=1; I'll keep poking through it in the morning to see if I'm just missing something. detailed_make.txt

ndantam commented 6 years ago

Ok, cblas_dnrm2 is an external library function (part of the C API for BLAS, which is natively Fortran). In Debian/Ubuntu, the CBLAS functions are in the same library as the Fortran BLAS functions. Maybe Arch uses a separate library?

wbthomason commented 6 years ago

I've fixed the last remaining error. It was due to Arch weirdness, though not that particular Arch weirdness.

Possibly uninteresting detail follows: Arch has packages cblas, which provides CBLAS, and blas and openblas, both of which provide an implementation of BLAS. I had cblas and openblas installed. The weirdness (or, rather, unexpected condition) from Arch is that openblas is configured to be linked against with -lopenblas rather than -lblas. This breaks pkg-config for cblas, as cblas is trying to link against BLAS with -lblas. This means that pkg-config --libs --cflags cblas will fail. I'm not sure if your autotools setup for amino uses pkg-config to find linker flags, but manually modifying the generated Makefile to include -lcblas in the LIBS variable did the trick and made everything happily compile.
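The fallback described above (ask pkg-config, and use a hard-coded -lcblas when it fails) can be sketched in Python as follows. The function name link_flags is made up for illustration, and nothing here reflects how amino's build scripts actually locate CBLAS.

```python
import shlex
import subprocess


def link_flags(package, fallback):
    """Return linker flags from `pkg-config --libs PACKAGE`, or `fallback` on failure."""
    try:
        done = subprocess.run(
            ["pkg-config", "--libs", package],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
    except FileNotFoundError:
        # pkg-config itself is not installed.
        return list(fallback)
    flags = shlex.split(done.stdout)
    if done.returncode != 0 or not flags:
        # Covers the broken case above, where a dependency's .pc file is missing.
        return list(fallback)
    return flags


print(link_flags("cblas", ["-lcblas"]))
```

Configure scripts generated by autoconf generally accept a LIBS=... argument, so passing LIBS=-lcblas to ./configure may be a less fragile alternative to editing the generated Makefile; that has not been tried in this thread.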

wbthomason commented 6 years ago

To summarize this chain of errors when compiling on Arch:

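  1. Without the compiler-flag checks (see item 2), gfortran is invoked without the flags that enable the necessary Fortran extensions, such as -ffree-line-length-none. I saw the same errors with gfortran 4.8.4 and with later versions, so this does not appear to be a gfortran version problem after all.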
  2. Arch doesn't include autoconf-archive with autoconf by default. Amino relies on the macros from this archive being present.
  3. Newer gcc (8.1.0) is stricter with templates than 4.8.4, and thus throws errors on syntax that was permitted with 4.8.4. The first patch fixes this.
  4. Use of Boost's shared_ptr type leads to an incomplete type error in TypedSpaceInformation.h. Replacing this with std::shared_ptr fixes the error.
  5. Having f2c and gfortran in the PATH leads to f2c being used where it should not be, and causes undefined symbols in libamino.so. Uninstalling f2c fixes this.
  6. By default, ./configure does not appear to link against CBLAS on Arch, possibly because of the aforementioned pkg-config brokenness. Manually editing the Makefile is a hacky fix for this; perhaps there's a better option? I will be filing a bug on the Arch openblas package, because it should provide blas.pc by default anyway.

With all of these fixes applied, make and sudo make install succeed. I haven't tried linking against libamino yet to confirm that everything is working, but there are at least no compile errors, and aarxc seems to run correctly.

Thank you very much for all the help! I look forward to using Amino. Please let me know if I can help with anything, or if you need more information or help reproducing one or more of these errors.

ndantam commented 6 years ago

Great, glad we sorted this out.

Thank you for summarizing the necessary steps and changes. I will update the integration tests when I return from ICRA.

biotinker commented 3 years ago

I'm currently having this same issue, on Ubuntu Focal. I confirmed the autoconf macros are installed, as that seems to be a common issue.

When running with ./configure --without-fortran I can build fine, but if I try to run ./configure --without-fortran --enable-demos, I get the following error from make V=1:

/bin/bash ./libtool  --tag=CC   --mode=link gcc  -g -O2   -o aarx.core    -lnlopt -lglpk -lpthread -ldl -lm -llapack -lblas 
libtool: link: gcc -g -O2 -o aarx.core  -lnlopt -lglpk -lpthread -ldl -lm -llapack -lblas
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/Scrt1.o: in function `_start':
(.text+0x24): undefined reference to `main'
collect2: error: ld returned 1 exit status

It sounds like I would need to install an old version of gfortran, but 4.8 is not available on the latest Ubuntu LTS. Is there another way?
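One detail in the log above is worth noting: the link command for aarx.core names only libraries, with no object or source files, so nothing on the line can define main. The Python sketch below checks this mechanically; the string is copied from the libtool output above, and link_inputs is a made-up helper name. Why the input list is empty is not established here, but it suggests looking at how the aarx.core sources are selected under --without-fortran rather than at the gfortran version.

```python
import shlex

# Link command copied from the libtool output above.
LINK = "gcc -g -O2 -o aarx.core  -lnlopt -lglpk -lpthread -ldl -lm -llapack -lblas"


def link_inputs(command):
    """Return the non-option arguments (objects, sources, archives) of a link command."""
    inputs = []
    skip_next = False
    for token in shlex.split(command)[1:]:  # [1:] drops the compiler name
        if skip_next:
            skip_next = False
        elif token == "-o":
            skip_next = True  # the next token is the output file, not an input
        elif not token.startswith("-"):
            inputs.append(token)
    return inputs


print(link_inputs(LINK))
```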