fpco / inline-c

284 stars 51 forks source link

Track included files via TH's addDependentFile #153

Open MathiasSven opened 7 months ago

MathiasSven commented 7 months ago

Closes #152


Previously, files included via C.include would not be tracked by GHC, so changes made to say test.h would not recompile a Foo.hs with C.include "test.h". If working with cabal, tracked files must be placed in cabal's top level extra-source-files stanza in order for this to be useful, more context given in the relevant issue.

Possible outstanding issues:

The solution in this PR only applies to files that are referenced relative to the current file. That is, if one were to have this setup:

package
 ├──src
 │   └──Main.hs
 ├──cbits
 │   └──test.h
 └──test.cabal

With a cabal file containing: ghc-options: -threaded -I./cbits

While the usage of C.include "test.h" would still work, it would no longer be tracked. I can't think of any solution to this other than trying to access the GHC flags from within TH, which I have seen done, but seemed very hacky, using unsafeCoerce to get a TcM from a Q.

Perhaps there is another principle approach that would make use of how the C preprocessor itself looks for files. As it stands, this PR also only applies to includes of the form "..." ignoring <...>.

roberth commented 7 months ago

trying to access the GHC flags from within TH

Yeah, that's a red flag. IIRC we're already using a GHC api that's specifically about C/C++, so perhaps GHC itself is a better place to implement this. That way it works for all TH that generates such code. Having access to the flags, it could leverage the compiler flags that makefiles use for this purpose. That way it works for all compiler flags, as well as for transitive includes.

I'd be ok with merging a workaround like this, if it has a comment referencing a GHC issue that describes the problem.

junjihashimoto commented 7 months ago

@MathiasSven Thank you for creating this PR. Could you add the comments with a GHC issue and a outstanding issue? Normally we use include-dirs directive of cabal, so that issue of ' -I./cbits' of ghc-options isn't a problem, right?

MathiasSven commented 7 months ago

It would still be a problem, this PR only addresses includes that are relative to the current file, so if you had a include-dirs, you could now reference files relative to the included directories, but they would no longer be tracked, because TH doesn't know what the included directories are.

About the GHC issue, are you suggesting I open a GHC issue or is there one already on this topic? I am likely wrong on this, I didn't take much time to understand this library's internals, but doesn't it work by constructing an intermediate module that then contains all the foreign import calls? I don't think it is using any GHC API, (GHC is also a hidden module, and it doesn't look to be in the build-depends of inline-c). What would such a GHC feature look like? I always thought that GHC didn't do much with the C/C++ code other than sending it to the C-preprocessor and CC/CXX compilers, meaning what is or isn't in is search path, or what the transitive includes are, is opaque to GHC, it only links against the object files created.

junjihashimoto commented 7 months ago

Build dependencies are generally an issue that build tools should look at. This time it's cabal. It's hard to say that it's a ghc issue. ghc will basically only check haskell code. Header files of c++ are not compiled and ghc need to look inside c++ codes to determine their dependencies, but that is beyond ghc's role. If it is a local file, check the dependencies. If it is a shared library, it may not check dependencies. The behavior may confuse users. If we run into this issue, it may make sense to use the -fforce-recomp option in general.

junjihashimoto commented 7 months ago

So, is it better to use this function only when users want to use it?

MathiasSven commented 7 months ago

Build dependencies are generally an issue that build tools should look at. This time it's cabal. It's hard to say that it's a ghc issue. ghc will basically only check haskell code. Header files of c++ are not compiled and ghc need to look inside c++ codes to determine their dependencies, but that is beyond ghc's role.

I agree that this is beyond GHC's role, for that matter, I also think it might be beyond what one would expect from Cabal.

Both Cabal and GHC don't do much with C/C++ files, they mostly relay information to the respective compilers. However, I think there could be some middle ground. Projects with more complex build steps may opt for more granular build systems, but given this library (as I see it) aims to simplify the process of FFI for the Haskell programmer, I think it would be fine to place some build responsibilities on it. include-c already introduces its own concept of dependency via TH, it just can't fully translate it to Cabal/GHC, in order for it to be able to do so, it needs a little bit more out of GHC.

If GHC exposed to TH more information about how it is being called, and specifically, how it calls the C/C++ compiler, we could both check all included directories, and also call $(CC) -M foo.h to get all transitive dependencies as Robert alluded to. This way, no extra burden would be placed on GHC.

If it is a local file, check the dependencies. If it is a shared library, it may not check dependencies. The behavior may confuse users. If we run into this issue, it may make sense to use the -fforce-recomp option in general.

I could add a section to the documentation as part of the PR to clarify when a file would be tracked or not.

If it is a shared library, it may not check dependencies.

Unless you were working on both a shared library and Haskell simultaneously this issue should not really manifest itself, the main issue I think is if you are using cbits or some other folder that TH doesn't know ahead of time, that I think is quite common. I would still leave this hotfix for those that can use it, but I understand not wanting it, so the behaviour is consistent.

roberth commented 7 months ago

So IIUC Cabal never tracks header dependencies? My use of cxx-sources is very limited, so I might have got lucky.

If some of the interactions are not supported, that's indicative of an architectural problem, and/or a bug that should be fixed.

Nonetheless, I am ok with taking a pragmatic approach in this library, as Mathias suggests. If we don't solve the whole problem (e.g. cxx-sources recompilation), it would be nice to warn our users through our docs.

MathiasSven commented 7 months ago

So IIUC Cabal never tracks header dependencies?

I don't believe so, it will track the files you explicitly put in cxx-sources, but changes to header files included in those files will not trigger a rebuild of the file.

Going back to what can be done to fully fix this issue (at least without waiting on GHC/Cabal to implement a standardized of way of dealing with this), as I mentioned before there already is a "hack" that would give us all the information we needed:

{-# LANGUAGE TemplateHaskell #-}

module Main where

import Language.Haskell.TH.Syntax
import GHC.Tc.Types
import GHC.Driver.Flags
import GHC.Settings
import GHC.Driver.Session
import Unsafe.Coerce

info = $( let runTcM :: TcM a -> Q a
              runTcM action = Q (unsafeCoerce action)
          in
            do 
              dflags <- runTcM getDynFlags
              root   <- getPackageRoot 
              let ts = toolSettings dflags

              lift
                [ show $ includePaths dflags
                , root
                , toolSettings_pgm_cxx ts
                , toolSettings_pgm_c ts
                , fst $ toolSettings_pgm_P ts
                ]
        )

main :: IO ()
main = putStrLn $ unlines info

ghc -Ifoo -Ibar -fforce-recomp Main.hs && ./Main

IncludeSpecs {includePathsQuote = [], includePathsGlobal = ["foo","bar"], includePathsQuoteImplicit = ["."]}
.
/nix/store/sfgnb6rr428bssyrs54d6d0vv2avi95c-gcc-wrapper-12.3.0/bin/c++
/nix/store/sfgnb6rr428bssyrs54d6d0vv2avi95c-gcc-wrapper-12.3.0/bin/cc
/nix/store/sfgnb6rr428bssyrs54d6d0vv2avi95c-gcc-wrapper-12.3.0/bin/cc

@roberth Would your suggestion be to raise an issue on GHC about giving a non unsafeCoerce interface to this information, and in the meanwhile add a section to the documentation explaining when files are or aren't tracked via include-c?

edit: I opened an issue at haskell/core-libraries-committee proposing a stable API to the above. Discussion relevant to this issue has moved to the GHC Issue tracker at issue 24353.