NixOS / nixpkgs


Default integer size in Fortran #35208

Open markuskowa opened 6 years ago

markuskowa commented 6 years ago

Issue description

In Fortran, the default size of an integer can be freely chosen to be either 4 or 8 bytes. This leads to an ambiguity in API calls that is not easily visible when linking against a Fortran library. For example, one can build an application with an 8-byte default integer size and link it against a library built with 4-byte integers. A derivation that does this will build flawlessly but crash at run time (unless a test catches the mistake). This led, for example, to an issue with arpack (https://github.com/NixOS/nixpkgs/issues/33921). For many derivations it is not clearly visible whether they are built with 4- or 8-byte integers, and most of the time only one version exists. However, to guarantee that an application is built properly, one has to make sure that (1) all inputs are built with the same convention and (2) the derivation knows which convention its build inputs use. For example, a C program that links against a Fortran library does not take the Fortran compiler as an input, yet it still needs to know which convention was used.
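To make the failure mode concrete, here is a minimal sketch in plain Python (no Fortran involved) of what a callee sees when the two sides disagree on integer width. The helper names are made up for illustration, and little-endian byte order is assumed, as on x86_64.

```python
import struct

# Illustrative helpers (hypothetical names): they only model the memory
# layout of a default-integer array, nothing here calls real Fortran code.
FORMATS = {4: "i", 8: "q"}  # struct codes for 4-byte and 8-byte signed ints


def pack_integers(values, width):
    """Lay out values as a contiguous little-endian integer array."""
    return struct.pack(f"<{len(values)}{FORMATS[width]}", *values)


def read_integers(buffer, count, width):
    """Read `count` integers of the given width from the start of buffer."""
    return list(struct.unpack_from(f"<{count}{FORMATS[width]}", buffer))


# Caller built with 8-byte default integers, library built with 4-byte ones.
caller_buffer = pack_integers([1, 2, 3], width=8)
seen_by_library = read_integers(caller_buffer, 3, width=4)
print(seen_by_library)  # [1, 0, 2] instead of [1, 2, 3]

# The reverse case: the library expects 24 bytes but only 12 were passed.
# Python raises struct.error here; native code would read past the buffer.
try:
    read_integers(pack_integers([1, 2, 3], width=4), 3, width=8)
except struct.error as err:
    print("short buffer:", err)
```

Neither case produces a link error, which is why the mismatch only shows up at run time.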

The question now is: how would one solve this in a clean Nix way? It would be desirable to avoid this type of problem, or at least catch it, already at build time.

Proposal

  1. Decide what the default should be. On x86_64-linux the compiler defaults to 4-byte integers. However, openblas currently assumes 8-byte integers by default and calls its 4-byte version openblasCompat.
  2. Create a wrapper for gfortran that changes the default to 8 bytes (-fdefault-integer-8). By changing the input parameter of a derivation one can then enforce either a 4- or 8-byte build. Although some derivations set this choice through configure, this would at least cover some cases (and set the default properly at build time).
  3. Introduce a flag that indicates how a package was built. Openblas already does that by setting blas64. I propose replacing that flag with a more general one, e.g. ilp64 (https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models), and using it in all derivations that use gfortran. For blas/lapack this has been solved in https://github.com/NixOS/nixpkgs/pull/83888
  4. Using the above features, build all packages in two versions (or at least allow for it via override). Some packages might not work properly in the 8-byte version without patching (?).
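As a rough sketch of what the wrapper in point 2 would do, the snippet below models only the argument handling, in Python rather than as an actual nixpkgs wrapper script. The function name and the `ilp64` parameter are hypothetical; `-fdefault-integer-8` is the gfortran option mentioned above.

```python
import shlex

ILP64_FLAG = "-fdefault-integer-8"  # gfortran option from point 2


def wrapped_gfortran_command(args, ilp64, compiler="gfortran"):
    """Return the command line a wrapper would run (hypothetical helper).

    The flag is prepended only for ILP64 builds and only if the caller
    has not already passed it, so user-supplied arguments stay intact.
    """
    extra = [ILP64_FLAG] if ilp64 and ILP64_FLAG not in args else []
    return [compiler, *extra, *args]


# solver.f90 is a placeholder file name.
print(shlex.join(wrapped_gfortran_command(["-O2", "-c", "solver.f90"], ilp64=True)))
print(shlex.join(wrapped_gfortran_command(["-O2", "-c", "solver.f90"], ilp64=False)))
```

This would not help for packages that pick the integer size in their own configure step, which is why the flag from point 3 is still needed.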

Any thoughts on what would be the best way to resolve this ambiguity?

knedlsepp commented 4 years ago

Just stumbled over this while trying to fix fenics, which (among other issues) has both openblas and openblasCompat in its closure, leading to a segfault. Maybe we could do something like a setup hook that checks for conflicts, building on point 3 of your proposal.

knedlsepp commented 4 years ago

(Related: Seems numpy/scipy can be moved to blas64 soonish: https://github.com/scipy/scipy/pull/11193)

markuskowa commented 4 years ago

@knedlsepp I almost forgot about this issue but the problem still exists. However, I don't know yet what the best solution could be. A setup hook is one possibility. Another possibility is a library function that tests for compatibility and throws an error in case the test fails.

Another point to consider here: not all Fortran code can be compiled with int64 (e.g. MPI is always int32, and scalapack does not work with int64).
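To illustrate what such a compatibility test could look like, here is a sketch in Python rather than Nix. It walks a package's closure and raises an error if both conventions show up. The dict layout, the `is_ilp64` key and the function names are assumptions for illustration only, not existing nixpkgs API; the example graph loosely follows the opencv/numpy case mentioned in this thread.

```python
class ConventionMismatch(Exception):
    """Raised when a closure mixes 4-byte and 8-byte integer builds."""


def closure(package):
    """Yield the package and all transitive dependencies, each once."""
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current["name"] in seen:
            continue
        seen.add(current["name"])
        yield current
        stack.extend(current.get("deps", []))


def check_integer_convention(package):
    """Return the common convention (True = ILP64) or raise on a conflict.

    Packages without an is_ilp64 entry are treated as not caring.
    """
    by_convention = {}
    for dep in closure(package):
        flag = dep.get("is_ilp64")
        if flag is not None:
            by_convention.setdefault(flag, []).append(dep["name"])
    if len(by_convention) > 1:
        raise ConventionMismatch(
            f"{package['name']}: ILP64 in {sorted(by_convention[True])}, "
            f"LP64 in {sorted(by_convention[False])}"
        )
    return next(iter(by_convention), None)


# Illustrative dependency graph, not the real nixpkgs one.
openblas = {"name": "openblas", "is_ilp64": True}
openblas_compat = {"name": "openblasCompat", "is_ilp64": False}
numpy = {"name": "numpy", "is_ilp64": False, "deps": [openblas_compat]}
opencv = {"name": "opencv", "deps": [openblas, numpy]}

try:
    check_integer_convention(opencv)
except ConventionMismatch as err:
    print("conflict:", err)
```

A setup hook would do the same kind of check at build time; a library function would fail earlier, at evaluation time.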

risicle commented 4 years ago

I've just run into this problem with opencv. opencv's python bindings depend on numpy... which is 4-byte only. But we're feeding 8-byte openblas to opencv. This leads to segfaults when you need to use opencv and numpy.linalg in the same program.

markuskowa commented 4 years ago

@risicle The blas/lapack switching mechanism has been changed recently, which addresses parts of this problem.

risicle commented 4 years ago

That actually seems to solve the problem on master, not that I exactly understand how. I suppose the "fix" that may be in order here is to make opencv use openblasCompat on the stable branches.

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.

markuskowa commented 3 years ago

While this issue has a solution for blas/lapack, there are probably still other cases where ILP64 vs. LP64 causes problems.

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.

markuskowa commented 2 years ago

Improvements: make more use of the isILP64 flag to indicate how a blas/lapack version (and its consumer) was built. See e.g.:

markuskowa commented 1 year ago

Some more improvements: