SCons / scons

SCons - a software construction tool
http://scons.org
MIT License

Fortran compile fails with module dependencies #4177

Open james-thunes opened 2 years ago

james-thunes commented 2 years ago

Describe the bug: SCons does not appear to correctly find dependencies on *.mod files in Fortran files.

My project includes a Fortran library that includes a number of source files and a module including some shared variables. When compiling on windows with ifort, I get the following error when doing a clean build: error #7002: Error in opening the compiled module file. Check INCLUDE paths. Subsequent builds complete successfully.

Investigation shows that the above compiler error is seen because the .mod file required by the source files is not built before the source file compilation. Reordering the list of source files improves the situation, but it still fails periodically when building in parallel. Subsequent attempts to compile work because the .mod file was created during the first scons attempt and is thus already available.

Per suggestion from discord, I ran scons with --tree=prune to check if the consumers of the module correctly listed the *.mod file as a build dependency. The output below (anonymized) shows the output for one of the source files:

  |   +-<path>\<objName>.staticrt.obj
  |   | +-<relativePath>\<srcFile>.f
  |   | +-C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.5.281\windows\bin\intel64\ifort.EXE

Note that the source file does not list the *.mod file as a dependency.
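
For anyone who wants to repeat this check on a larger tree, a small script can do the scanning. The following is a rough sketch, not part of SCons: the function name objects_without_mod_deps is made up for illustration, and it assumes the "+-" layout shown above with two columns of indentation per tree level.

```python
def objects_without_mod_deps(tree_text, obj_suffixes=(".obj", ".o"), mod_suffix=".mod"):
    """Return object nodes in `scons --tree` output that list no .mod child.

    Hypothetical helper: assumes each tree level is indented by two columns
    and that bracketed nodes ([name]) are pruned repeats shown without children.
    """
    nodes = []
    for line in tree_text.splitlines():
        pos = line.find("+-")
        if pos < 0:
            continue  # not a tree line
        nodes.append((pos // 2, line[pos + 2:].strip()))

    missing = []
    for i, (depth, name) in enumerate(nodes):
        if name.startswith("[") or not name.lower().endswith(obj_suffixes):
            continue  # skip pruned repeats and non-object nodes
        has_mod = False
        for child_depth, child in nodes[i + 1:]:
            if child_depth <= depth:
                break  # left this node's subtree
            if child_depth == depth + 1 and child.strip("[]").lower().endswith(mod_suffix):
                has_mod = True
        if not has_mod:
            missing.append(name)
    return missing
```

An object whose source uses no modules will also be reported, so the result is only meaningful when compared against the sources known to contain use statements.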

The project is cross-platform so I also checked the result of --tree=prune on linux (compiled with gfortran). SCons also did not recognize the dependency on the mod file (output was the same as above with the exception of the compiler path). However, the code compiles without issue on linux.

Since the above shows that the source files are not identifying the *.mod file as a build dependency, my assumption is that there is some issue in the way that scons is determining dependencies for Fortran files. I'm not sure why there's not an issue with gfortran on linux.

james-thunes commented 2 years ago

I haven't been able to get a small test case set up yet. I will try to create one and post to this issue

dnwillia-work commented 2 years ago

I'll just add that when I was testing this locally it seems to only be reproducible when running a parallel build with -jN. A serial build would do things in the right order and I never saw that fail, but for parallel it's failing when a dependent source file is compiled concurrently before the module is complete.

bdbaddog commented 2 years ago

I'll just add that when I was testing this locally it seems to only be reproducible when running a parallel build with -jN. A serial build would do things in the right order and I never saw that fail, but for parallel it's failing when a dependent source file is compiled concurrently before the module is complete.

That's just luck... If parallel builds fail due to missing dependencies (for any build system), it means the build system doesn't have all the appropriate dependencies and so isn't building everything needed for the failing build step. I run into this often (though mainly with non-scons based build systems where it's harder to get this right)..
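
To make the point concrete, here is a toy illustration (not SCons code) of why a missing dependency edge only bites in parallel. It uses Python's standard graphlib module (Python 3.9+) to group tasks into "waves" that a parallel scheduler would be free to run at the same time. The node names are hypothetical.

```python
from graphlib import TopologicalSorter


def build_waves(deps):
    """Group nodes into waves; everything in one wave may run concurrently.

    `deps` maps each node to the set of nodes it must wait for.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Dependency recorded: the module is built first, the consumer afterwards.
with_edge = {"consumer.obj": {"shared.mod"}, "shared.mod": set()}

# Dependency missing: both land in the same wave and can race under -jN.
without_edge = {"consumer.obj": set(), "shared.mod": set()}
```

With the edge present, build_waves(with_edge) gives two waves; without it, both nodes share a single wave, so whether the build succeeds depends on which compile happens to finish first.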

dnwillia-work commented 2 years ago

Yeah, fair enough. Even for the parallel builds I'll add that it would sometimes work, sometimes not.

I wonder whether, since it seems to be a Windows-specific thing, something else is going on, e.g. the .mod file is still locked by the OS so the dependent module cannot link against it. We have run into this kind of issue with .lib (import library) files in parallel builds on Windows before as well, when an Install() was used to move the .lib file to some other place before it was linked into a dependent module, despite the fact that we would add Depends() on the install task to the dependent module. We removed the use of Install() targets for this as a result and it got better.

bdbaddog commented 2 years ago

Ugh.. the joys of NTFS..

bdbaddog commented 2 years ago

How are you creating your Environment()? Specifically are you setting FORTRANSUFFIXES (or any of the F*SUFFIXES env vars)?

bdbaddog commented 2 years ago

Try adding this to your Environment()

FORTRANFILESUFFIXES=['.f90'],

james-thunes commented 2 years ago

this particular library has all *.f files, but I can try setting FORTRANFILESUFFIXES=['.f'],
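
For reference, the suggestion amounts to something like the SConstruct fragment below. It is only a sketch: it assumes the suffix list is passed when the Environment is constructed, so that the Fortran tools can see it while they set themselves up, and that .f is the only suffix the generic FORTRAN dialect needs to claim in this library.

```python
# SConstruct (sketch). Environment is provided by SCons when it reads this file.
env = Environment(
    # Ask the generic FORTRAN dialect to handle .f sources.
    FORTRANFILESUFFIXES=['.f'],
)
```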

dnwillia-work commented 2 years ago

How are you creating your Environment()? Specifically are you setting FORTRANSUFFIXES (or any of the F*SUFFIXES env vars)?

We definitely do not configure that particular variable currently. We do configure these variables related to this issue

    env["FORTRANMODDIR"] = "${TARGET.dir}"
    env["F90PATH"] = "${TARGET.dir}"

It's the same for ifort and gfortran.

mwichmann commented 2 years ago

Okay, I've located a simple example and don't see the problem with it. Here's the tree dump:

+-.
  +-SConstruct
  +-list-module.f90
  +-list-module.o
  | +-list-module.f90
  | +-/bin/gfortran
  +-listmodule.mod
  | +-list-module.f90
  | +-/bin/gfortran
  +-main-list
  | +-main-list.o
  | | +-main-list.f90
  | | +-[listmodule.mod]
  | | +-/bin/gfortran
  | +-[list-module.o]
  | +-/bin/gfortran
  +-main-list.f90
  +-[main-list.o]

main-list.o is shown as depending on its source file, on listmodule.mod, and on the compiler. That looks right to me? What am I missing?

bdbaddog commented 2 years ago

@mwichmann - I think this is only an intel fortran issue. And I think it's due to loading default tools, then ifort and having files named .f when ifort looks like it only wants files named .i or .i90?

bdbaddog commented 2 years ago

this particular library has all *.f files, but I can try setting FORTRANFILESUFFIXES=['.f'],

Did that work?

mwichmann commented 2 years ago

I still have to look at the .i/.i90 thing, I think it should work just fine on ordinary suffixes and I don't know what those actually mean.

james-thunes commented 2 years ago

this particular library has all *.f files, but I can try setting FORTRANFILESUFFIXES=['.f'],

Did that work?

Sorry, been busy with other tasks. Will try to get to this as soon as I can.

james-thunes commented 2 years ago

this particular library has all *.f files, but I can try setting FORTRANFILESUFFIXES=['.f'],

Did that work?

Ok, I was able to spend a bit of time with this today. I'm a bit confused to be honest. Setting FORTRANFILESUFFIXES=['.f'], did indeed appear to resolve the issue. I was able to see the module dependencies when looking at the output with --tree=prune. However, after removing the flag and doing a clean rebuild of the code, I see that the module dependencies are still listed. They definitely weren't last week...

mwichmann commented 2 years ago

Most of this setup bemuses me, so you have company :)

Out of curiosity, are all the files in your project of the same suffix?

dnwillia-work commented 2 years ago

The library in question has a combination of both .f90 and .f suffixes since it includes F90 free format and F77 fixed format code.

mwichmann commented 2 years ago

Okay, then the muck with "dialects" isn't completely useless to you (or if it is - please feel free to let us know!).

dnwillia-work commented 2 years ago

Right, the library in question uses both dialects because it links in an older code base shared with other applications. So, definitely useful to us. One issue though is that it's still valid to have F90/95 etc... code in fixed form files with a .f extension, so it muddies the water some more.

I tried the --tree=prune thing with the latest SCons. SCons by Steven Knight et al.: SCons: v4.3.0.559790274f66fa55251f5754de34820a29c7327a, Tue, 16 Nov 2021 19:09:21 +0000, by bdeegan on octodog

on the problematic code and I get the following

  | +-<path-to-lib>\importLibrary.lib
  |   +-<path-obj>\module1.obj
  |   | +-<path-src>\module1.f
  |   | +-C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.5.281\windows\bin\intel64\ifort.EXE
  |   +-<path-obj>\module2.obj
  |   | +-<path-src>\module2.f
  |   | +-C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.5.281\windows\bin\intel64\ifort.EXE
  |   +-<path-obj>\fortranFile1.obj
  |   | +-<path-src>\fortranFile1.f
  |   | +-C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.5.281\windows\bin\intel64\ifort.EXE

In this instance, fortranFile1.f has both use module1 and use module2 statements within the contained functions/subroutines, yet the dependency is not showing up in the tree. However, if I look further down in the tree I see this:

  |   +-<path-obj>\fortranFile2.obj
  |   | +-<path-src>\fortranFile2.f90
  |   | +-<path-obj>\moduleA.mod
  |   | | +-<path-src>\moduleA.f90
  |   | | +-<path-obj>\moduleB.mod
  |   | | | +-<path-src>\moduleB.f90
  |   | | | +-C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.5.281\windows\bin\intel64\ifort.EXE
  |   | | +-C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.5.281\windows\bin\intel64\ifort.EXE
  |   | +-[<path-obj>\moduleB.mod]
  |   | +-C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.5.281\windows\bin\intel64\ifort.EXE

So, it seems to be getting the dependency correct for .f90 extensions but not .f extensions.

Could it be that the .mod files are not added here properly when we are passing .f files that use modules to the library target?

mwichmann commented 2 years ago

So, it seems to be getting the dependency correct for .f90 extensions but not .f extensions.

That's not impossible, since they use separate settings groups, but this stuff is convoluted enough it's hard to see. There are also slightly differing settings for lib and non-lib.

Did you say you were using a modified copy of ifort.py?

Is it possible to attach a dump of the environment variables (really only the set that start with F are interesting, I believe) - it might show the anomaly that's hitting your usage.
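
One possible way to produce such a dump is sketched below. The helper name fortran_vars is made up; it only filters a plain dictionary, and the commented usage assumes env is an existing Environment whose Dictionary() method returns its construction variables.

```python
def fortran_vars(variables):
    """Return the construction variables whose names start with 'F', sorted by name."""
    return {name: variables[name] for name in sorted(variables) if name.startswith("F")}


# Inside an SConstruct one might call it like this (env is an existing Environment):
#   for name, value in fortran_vars(env.Dictionary()).items():
#       print(name, "=", value)
```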

dnwillia-work commented 2 years ago

Yes, we have a modified version of ifort.py. Staring at the differences with your current code the diffs are:

Here's the variables:

FORTRANFLAGS = ['/nologo', '/MD', '/Qvc14', '/warn:unused', '/warn:uncalled', '/O3'] # We set this one
F90FLAGS = ['/nologo', '/MD', '/Qvc14', '/warn:unused', '/warn:uncalled', '/O3'] # We set this one
F95FLAGS = ['/nologo', '/MD', '/Qvc14', '/warn:unused', '/warn:uncalled', '/O3'] # We set this one
FORTRANPATH = ['#.']
FORTRANSUFFIXES = ['.f', '.for', '.ftn', '.F', '.FOR', '.FTN', '.fpp', '.FPP', '.f77', '.F77', '.f90', '.F90', '.f95', '.F95', '.f03', '.F03', '.f08', '.F08']
FORTRANCOM = $FORTRAN -object:$TARGET -c $FORTRANFLAGS $_FORTRANINCFLAGS $_FORTRANMODFLAG $SOURCES
FORTRANPPCOM = $FORTRAN -object:$TARGET -c $FORTRANFLAGS $CPPFLAGS $_CPPDEFFLAGS $_FORTRANINCFLAGS $_FORTRANMODFLAG $SOURCES
FORTRANMODPREFIX =
FORTRANMODSUFFIX = .mod
FORTRANMODDIR = ${TARGET.dir} # We set this one
FORTRANMODDIRPREFIX = /module:
FORTRANMODDIRSUFFIX =
F77FLAGS =
F77COM = $F77 -object:$TARGET -c $F77FLAGS $_F77INCFLAGS $SOURCES
F77PPCOM = $F77 -object:$TARGET -c $F77FLAGS $CPPFLAGS $_CPPDEFFLAGS $_F77INCFLAGS $SOURCES
F90COM = $F90 -object:$TARGET -c $F90FLAGS $_F90INCFLAGS $_FORTRANMODFLAG $SOURCES
F90PPCOM = $F90 -object:$TARGET -c $F90FLAGS $CPPFLAGS $_CPPDEFFLAGS $_F90INCFLAGS $_FORTRANMODFLAG $SOURCES
F95COM = $F95 -object:$TARGET -c $F95FLAGS $_F95INCFLAGS $_FORTRANMODFLAG $SOURCES
F95PPCOM = $F95 -object:$TARGET -c $F95FLAGS $CPPFLAGS $_CPPDEFFLAGS $_F95INCFLAGS $_FORTRANMODFLAG $SOURCES
F03FLAGS =
F03COM = $F03 -o $TARGET -c $F03FLAGS $_F03INCFLAGS $_FORTRANMODFLAG $SOURCES
F03PPCOM = $F03 -o $TARGET -c $F03FLAGS $CPPFLAGS $_CPPDEFFLAGS $_F03INCFLAGS $_FORTRANMODFLAG $SOURCES
F08FLAGS =
F08COM = $F08 -o $TARGET -c $F08FLAGS $_F08INCFLAGS $_FORTRANMODFLAG $SOURCES
F08PPCOM = $F08 -o $TARGET -c $F08FLAGS $CPPFLAGS $_CPPDEFFLAGS $_F08INCFLAGS $_FORTRANMODFLAG $SOURCES
F77 = ifort
F90 = ifort
FORTRAN = ifort
F95 = ifort
F90PATH = ${TARGET.dir}  # We set this one

Noted which ones we explicitly set.

dnwillia-work commented 2 years ago

Based on my hypothesis I tried to reproduce this with a simple example, could not do it yet....

bdbaddog commented 2 years ago

@dnwillia-work - could you provide your modified version of ifort.py ?

dnwillia-work commented 2 years ago

Yeah sure. I’ll push it up later today and share a PR.

mwichmann commented 2 years ago

At least at my end, the effort of the ifort.py tool to add those extra recognized suffixes (.i and .i90) messes things up - you get into a place where apparently the .f files aren't scanned (the .f90 ones aren't either, so this doesn't explain your issue). Are those actually used for anything? As I said... somewhere... I can't find any hint of those in current Intel docs, which aren't particularly searchable.

dnwillia-work commented 2 years ago

Regarding the file extensions there is this:

https://www.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference/top/compiler-setup/use-the-command-line/file-extensions.html

So, Intel does seem to accept .i and .i90 as acceptable file extensions where those files are passed to the compiler.

mwichmann commented 2 years ago

Sigh... I'd swear I had been looking at that very page, must be blind.

dnwillia-work commented 2 years ago

lol. Yeah, it's not the easiest to navigate or digest it.

Here's the ifort tool module we are using currently:

https://github.com/dnwillia-work/scons/pull/1

mwichmann commented 2 years ago

There are these comments in the wiki

dnwillia-work commented 2 years ago

Thanks for sharing that. In general, I agree with what is proposed. The stuff about the dialects is probably overly complex.

I tried again to reproduce this issue outside our regular build system using the ifort.py tool I shared and cannot. My SConstruct looks like this:

import os
localTools = os.path.join('.', 'site_scons', 'site_tools')
env = Environment(toolPath = localTools, tools = ['default', 'ifort'])
env['LINKFLAGS'] += ['/subsystem:console'] # Needed to link a Program()
env.Program('hello77', ['hello77.f', 'fixedModule.f', 'freeModule.f90'])
env.Program('hello90', ['hello90.f90', 'fixedModule.f', 'freeModule.f90'])

Source files are:

hello77.f

      program hello
      use fixedModule, only : sayHello
      use freeModule, only : sayHello90
      call sayHello
      call sayHello90
      end program hello

hello90.f90

program hello90
  use fixedModule, only : sayHello
  use freeModule, only : sayHello90
  call sayHello
  call sayHello90
end program hello90

freeModule.f90

module freeModule
  public :: sayHello90
contains
  subroutine sayHello90()
    print *,"Hello from free form module!"
  end subroutine sayHello90
end module freeModule

fixedModule.f

      module fixedModule
      public :: sayHello
      contains
      subroutine sayHello()
        print *,"Hello from fixed form module!"
      end subroutine sayHello
      end module fixedModule

I get this tree:

+-.
  +-fixedModule.f
  +-fixedmodule.mod
  | +-fixedModule.f
  | +-C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.5.281\windows\bin\intel64\ifort.EXE
  +-fixedModule.obj
  | +-fixedModule.f
  | +-C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.5.281\windows\bin\intel64\ifort.EXE
  +-freeModule.f90
  +-freemodule.mod
  | +-freeModule.f90
  | +-C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.5.281\windows\bin\intel64\ifort.EXE
  +-freeModule.obj
  | +-freeModule.f90
  | +-C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.5.281\windows\bin\intel64\ifort.EXE
  +-hello77.exe
  | +-hello77.obj
  | | +-hello77.f
  | | +-[fixedmodule.mod]
  | | +-[freemodule.mod]
  | | +-C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.5.281\windows\bin\intel64\ifort.EXE
  | +-[fixedModule.obj]
  | +-[freeModule.obj]
  | +-C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.20.27508\bin\HostX64\x64\link.EXE
  +-hello77.f
  +-[hello77.obj]
  +-hello90.exe
  | +-hello90.obj
  | | +-hello90.f90
  | | +-[fixedmodule.mod]
  | | +-[freemodule.mod]
  | | +-C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.5.281\windows\bin\intel64\ifort.EXE
  | +-[fixedModule.obj]
  | +-[freeModule.obj]
  | +-C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.20.27508\bin\HostX64\x64\link.EXE
  +-hello90.f90
  +-[hello90.obj]
  +-SConstruct

So, it all looks ok.

Interesting side point: I have to add /subsystem:console there for env.Program() targets to link otherwise the linker does not find the entry point.

dnwillia-work commented 1 year ago

@mwichmann I ran into this again today on a fairly simple project I'm messing with.

I did some tracing/debugging and what I found is that this warning gets generated if I pass --warn=all on the command line:

https://github.com/SCons/scons/blob/810ca6c8895b01cbd636d83079f6a848dc36adf6/SCons/Scanner/Fortran.py#L114

when a module uses another module. So, if you do this in my prior example:

      module fixedModule
      use freeModule
      public :: sayHello
      contains
      subroutine sayHello()
        print *,"Hello from fixed form module!"
      end subroutine sayHello
      end module fixedModule

then that fails because fixedModule is compiled without compiling freeModule first.

I checked that the scanner correctly figures out that freeModule.mod is a dependency of fixedModule.f, but n here is returned as None:

n, i = self.find_include(dep, source_dir, path)

and you get the warning. I find it hard to parse everything that goes on if I follow the rabbit hole down the function call to find_include but if the general assumption is that the module files will be present in the source tree like .h, .c, .f, .f90 etc... then that assumption is incorrect. The module files are a compiler output file, i.e. generated by the compiler.
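
As a rough picture of the first half of that process, the sketch below derives the .mod names a source file needs from its use statements, leaving out modules the file defines itself. It is a simplified stand-in, not the code in SCons/Scanner/Fortran.py: the regular expressions and the function name mod_dependencies are assumptions made for illustration, and lower-casing the names simply mirrors the fixedmodule.mod and freemodule.mod nodes seen in the earlier tree.

```python
import re

# Simplified patterns (assumptions, not the ones SCons uses).
USE_RE = re.compile(r"^\s*use\b\s*(?:,\s*\w+\s*)?(?:::)?\s*(\w+)", re.I | re.M)
DEF_RE = re.compile(r"^\s*module\s+(?!procedure\b)(\w+)", re.I | re.M)


def mod_dependencies(source, suffix=".mod"):
    """Return the sorted .mod file names a Fortran source depends on."""
    used = {name.lower() for name in USE_RE.findall(source)}
    defined = {name.lower() for name in DEF_RE.findall(source)}
    # A module defined in this file is produced by it, not required by it.
    return sorted(name + suffix for name in used - defined)
```

Producing the names is only the first step. Each name still has to be resolved to a node in the build, and that is the find_include step which comes back empty in the report above.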

mwichmann commented 1 year ago

Interesting. Is this only with ifort?

bdbaddog commented 1 year ago

Can you paste output when run with scons --tree=all ?

mwichmann commented 1 year ago

The module files are a compiler output file, i.e. generated by the compiler.

As to the last sentence - at least on some level, the code knows that, that's what emitters do, add in files that are generated by tools.

mwichmann commented 1 year ago

Just as a note (unrelated to the issue), this stanza:

localTools = os.path.join('.', 'site_scons', 'site_tools')
env = Environment(toolPath = localTools, tools = ['default', 'ifort'])

should be unneeded - SCons already looks in site_scons/site_tools without you having to add that to the toolpath.
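
In other words, something like the following should be enough, assuming the custom ifort.py sits in site_scons/site_tools next to the SConstruct. If an explicit search path is ever needed, the keyword is spelled toolpath (all lowercase) and takes a list of directories.

```python
# SConstruct (sketch). site_scons/site_tools is searched automatically,
# so a tool module placed there can be named directly.
env = Environment(tools=['default', 'ifort'])
```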

mwichmann commented 1 year ago

and I can run this example, assuming I got the whole thing including problems, with gfortran (and taking out the Windows-specific flag), so I'm assuming we'll be able to spot what's different in the setup eventually.

dnwillia commented 1 year ago

@mwichmann Sorry for cluttering up this thread. I did a bunch more investigation and have reproduced a few issues, including the one in this thread, which looks like it is due to Fortran code with modules not working right with variant_dir. See this repo:

https://github.com/dnwillia/SConsTests

I think everything for this issue is reproduced there on a really basic example.

mwichmann commented 1 year ago

Okay. We admittedly have a lot of Fortran issues already, and not enough people using it to drive getting things fixed. Maybe this is a moment.

https://github.com/SCons/scons/labels/Fortran

#2635 (and maybe a few others) look like they may be related - or not; that was just a really quick glance.

dnwillia commented 1 year ago

lol, yes. James and I were speculating last night we are the only ones.

bdbaddog commented 1 year ago

Are you running off the master development branch? Or the latest release (4.5.2)?

mwichmann commented 1 year ago

From the linked gh:

None of this code builds out of the box at all on Windows. The ifort tool does not configure properly even if you have it pre-configured in the environment.

This is no surprise. The current Intel oneAPI toolset depends heavily on a setup that fills in environment variables, which SCons proceeds to ignore in its search for purity and reproducibility. Things which are vital to actually running ifort, icc, and their next-gen equivalents need to be set up in the tool. The msvc setup tool has the same problem, and it's grown substantially over the years as more and more stuff becomes needed. I know I've said this before, sorry if it's repeating: this kind of heavyweight setup (you count on a few thousand lines of shell script or batch file to have run, leaving your environment ready-to-go) isn't a great fit for the current SCons model of simple tool modules and isolation of the environment in which the builds actually run - it makes for a lot of work, and the ground keeps changing under you.

dnwillia commented 1 year ago

@bdbaddog The tests are all with 4.5.2... I've not quite figured out how to create a wheel yet from your repo. Happy to try that.

dnwillia commented 1 year ago

Matt, regarding Intel compilers I think the compromise would be to have the tool call the necessary configuration script for the Intel compiler to setup the Environment object so that code will build. I have a custom config tool that works, which I derived from a SCons provided tool years ago, but it's kind of a pain to maintain it.
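
Short of a tool that runs the vendor script itself, a blunt workaround that is sometimes used is to run the compiler's setup script in the shell first and then hand the whole shell environment to SCons. The fragment below is a sketch of that idea only; it gives up the isolation described above, since the build then depends on whatever the calling shell happens to contain.

```python
# SConstruct (sketch)
import os

# Copy the calling shell's environment into the one SCons uses to run commands.
env = Environment(ENV=os.environ.copy())
```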

bdbaddog commented 1 year ago

@bdbaddog The tests are all with 4.5.2... I've not quite figured out how to create a wheel yet from your repo. Happy to try that.

If you have a checked out git repo, you can just run via <path to your checked out scons>/scripts/scons.py <your normal args here>

mwichmann commented 1 year ago

Okay, and this line:

Building code with fortran modules does not work right with variant_dir. You need to set the location where fortran modules (.mod) get generated otherwise they get stored into the same directory as the SConstruct.

Yes, this comes up repeatedly. SConscripts are evaluated in the context of the directory they're in, but then the builds happen in the context of the SConstruct dir (simplification, but will do for now). Just had a wrestling match with flex/bison "extra files" over this. I know the past Fortran tool maintainers knew this, but that doesn't mean every combination is right. Certainly when glancing back over some of the past bugs this theme comes up several times - if there's a subdirectory involved (which there always is if there's a variant_dir in use), then things can go wrong.

mwichmann commented 1 year ago

@bdbaddog The tests are all with 4.5.2... I've not quite figured out how to create a wheel yet from your repo. Happy to try that.

If you have a checked out git repo, you can just run via <path to your checked out scons>/scripts/scons.py <your normal args here>

Or, in the top directory of your git checkout of scons, and with your virtualenv activated:

pip install -e .

dnwillia commented 1 year ago

OK let me give it a shot. I did not think of that.

dnwillia commented 1 year ago

Alright, done, updated the README with some notes about the install into my environment.

All the issues persist, including this one.

Matt I think the behaviour you describe here:

Yes, this comes up repeatedly. SConscripts are evaluated in the context of the directory they're in, but then the builds happen in the context of the SConstruct dir (simplification, but will do for now). Just had a wrestling match with flex/bison "extra files" over this.

is fine. It's just that the object files produced by the compiler do not end up in that directory. The module files should really be going to the same place as the object files. There is a link in my notes to the change I made that gets them into the right spot.

mwichmann commented 1 year ago

Which, at first glance, sounds sane (if you mean env["FORTRANMODDIR"] = "${TARGET.dir}"). Waiting on @bdbaddog though!

dnwillia commented 1 year ago

Which, at first glance, sounds sane (if you mean env["FORTRANMODDIR"] = "${TARGET.dir}"). Waiting on @bdbaddog though!

Yes, that's the line I mean.
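
Put together with a variant directory, the arrangement being discussed might look like the sketch below. The directory names src and build are hypothetical, and this only shows the shape of the setup from the thread, not a confirmed fix.

```python
# SConstruct (sketch; 'src' and 'build' are hypothetical directory names)
env = Environment()

# Write generated .mod files next to the object files instead of the SConstruct dir.
env['FORTRANMODDIR'] = '${TARGET.dir}'

# Build src/SConscript in a separate variant directory without copying sources.
SConscript('src/SConscript', variant_dir='build', duplicate=False, exports='env')
```

The setup shown earlier in the thread also pointed F90PATH at the same ${TARGET.dir}, so that the compiler looks for modules where they were written.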