safesparrow / fsharp

The F# compiler, F# core library, F# language service, and F# tooling integration for Visual Studio
https://dotnet.microsoft.com/languages/fsharp
MIT License
1 stars 0 forks source link

Tc env signature merge repro #9

Closed nojaf closed 1 year ago

nojaf commented 1 year ago

Hello,

Let's start off with a bold statement: I don't think you can type check a signature file in parallel.

Allow me to elaborate: Inside CheckOneInputAux' we call the CheckOneSigFile function to type check the signature AST.

// Typecheck the signature file
let! tcEnv, sigFileType, createsGeneratedProvidedTypes =
    CheckOneSigFile
        (tcGlobals,
         amap,
         tcState.tcsCcu,
         checkForErrors,
         conditionalDefines,
         tcSink,
         tcConfig.internalTestSpanStackReferring)
        tcState.tcsTcSigEnv
        file

Inside this function, a bunch of stuff is being added to the TcEnv (tcState.tcsTcSigEnv to be specific).

The callback function of CheckOneInputAux' will update this TcEnv:

  printfn $"Finished Processing Sig {file.FileName}"
  return fun tcState ->
      printfn $"Applying Sig {file.FileName}"
      let fsiPartialResult, tcState =
          let rootSigs = Zmap.add qualNameOfFile sigFileType tcState.tcsRootSigs

          // Add the signature to the signature env (unless it had an explicit signature)
          let ccuSigForFile = CombineCcuContentFragments [ sigFileType; tcState.tcsCcuSig ]

          let partialResult = tcEnv, EmptyTopAttrs, None, ccuSigForFile

          let tcState =
              { tcState with
                  tcsTcSigEnv = tcEnv
                  tcsTcImplEnv = tcState.tcsTcImplEnv
                  tcsRootSigs = rootSigs
                  tcsCreatesGeneratedProvidedTypes = tcState.tcsCreatesGeneratedProvidedTypes || createsGeneratedProvidedTypes
              }
          partialResult, tcState

      fsiPartialResult, tcState

Notice that tcsTcSigEnv = tcEnv will completely overwrite the TcEnv for the current signature file. This is problematic when you have signature files that don't depend on each other. What one file added to the TcEnv can get overridden by another.

Consider my sample project:

B depends on A and C has no relations and so A and C will be type-checked in parallel. That callback of A will update the tcState.tcsTcSigEnv with the tcEnv result of type checking A.fsi. The callback of C will do the same thing accordingly.

When D.fsi is being processed, the callbacks will be invoked correctly. But as C.fsi overrides the information A.fsi, B.fsi added, A.fsi will no longer be known in the TcEnv. This will be problematic when D.fsi is type checked.

The tcState, however, will have the correct rootSigs added. So looking merely at the state, at a glance you would think that everything happened correctly.

This problem does not occur in the current type check impl/sig pairs in parallel. Because the signature files are still processed in order, leading to correct TcEnv and later the Choice2Of2 implementation files are processed in parallel.

So, now what? Short term I think it is worth trying to type-check the signature file as part of the callback. This is not ideal, but the only other option would be to merge the TcEnv of signature files. Which are a bunch of nested collections and I'm not sure how easy that will be.

To see all this with your own eyes, you would need to run the ParallelTypeCheckingTests project in "graph" mode using the args file of the sample. (Yup, hard coded absolute paths in that args.txt file).

Put a breakpoint in:

let tcState =
{ tcState with
    tcsTcSigEnv = tcEnv
    tcsTcImplEnv = tcState.tcsTcImplEnv
    tcsRootSigs = rootSigs
    tcsCreatesGeneratedProvidedTypes = tcState.tcsCreatesGeneratedProvidedTypes || createsGeneratedProvidedTypes
}

Comparing tcsTcSigEnv with tcEnv: image

In the NameEnv.eModulesAndNamespaces you can spot that one will have A and the other will have B.

I've added some code to really crash the application when a problem occurs.

nojaf commented 1 year ago

Resolve by #11