safesparrow / fsharp

The F# compiler, F# core library, F# language service, and F# tooling integration for Visual Studio
https://dotnet.microsoft.com/languages/fsharp
MIT License
1 stars 0 forks source link

Alternative Trie approach #24

Closed nojaf closed 1 year ago

nojaf commented 1 year ago

This is an alternative take on the existing trie idea.

Some key differences:

I believe we can create some good tooling around displaying the file contents processing and the trie. This would allow the user to troubleshoot their own codebase in case we missed something.

Now the bad 😅, this PR contains a lot of hardcoded stuff and cannot be merged as is. I'd like to get a yay or nay before I completely replace the existing code and clean everything up. It did work fine for me on Fantomas.Core, Fantomas.Core.Tests, FSharp.Compiler.ComponentTests & FSharp.Compiler.Service. I'd be happy to show that during a call.

We should definitely continue to invest in https://github.com/safesparrow/fsharp/issues/22. Perhaps even in existing unit tests where possible.

safesparrow commented 1 year ago

Things to resolve:

safesparrow commented 1 year ago

FYI I ran a profiler for dependency resolution for FCS. Below it takes ~660ms, despite the test logs showing the below:

TrieMapping.mkTrie took 00:00:00.0142921
FileContentMapping.mkFileContent took 00:00:00.0002153
mkGraph took 00:00:00.0000092

So I think we need to update the logs to include all the operations.

Snapshot:

image

40% of time is spent in GC and we allocate 686MB of memory. We might want to look at a few optimizations to reduce memory allocations. Alternatively we can make GC faster in FSC by increasing heap count (we probably want to do that anyway, at least as an opt-in).

EDIT: Here is a comparison of server GC with heap count of 12 (left) vs heap count of 2 (right). Not sure why the one on the left has those long gaps during GC - maybe the profiler needs to do work at some point and blocks the app? Not sure.

image
nojaf commented 1 year ago

Hello, I believe I've addressed the open issues:

=> Only top-level [<AutoOpen>] and global namespace have this. Nested modules with [<AutoOpen>] will add the current file index to the trie (similar to types in namespaces).

=> See ghost dependency solution. (Scenario "A partial open statement still links to a file as a last resort" covers this).

I changed a couple of Seq calls to Array/List. I'm currently very confused about the timings. When profiling the unit test I think mkGraph still takes about 500 ms. I might be able to improve something on the file level.

The next thing is to integrate it with the actual type-checking code. I would like to be able to re-use the scenario's in a compilation test.

nojaf commented 1 year ago

Hi again, I finished the implementation. Some timings (from running 1. Test sequential type-checking and 3. Test graph-based type-checking):

Sequential:
Fantomas.Core: 1.865596s
FCS: 40.9183435s
FSharp.Compiler.ComponentTests: 2.9444694s

Graph:
Fantomas.Core: 1.2768386s
FCS: 21.1988973s
FSharp.Compiler.ComponentTests: 1.3280543s

I think this is good to go. I did have to remove most logs to see the improvement for Fantomas.Core.

Some highlights of these last changes:

=> I believe strictly for type-checking these aren't hard requirements but I got errors in the IlxGen code later on when not having these requirements. (See final fold experiment)

If you are on board with this PR I can strip all the hardcoded stuff and we can merge this in.

safesparrow commented 1 year ago

Thanks for all the work.

Some highlights of these last changes:

When signature files are present, that content is being used to populate the trie. However, when a match is found in a query later on I return the index of the implementation file instead. An implementation file will always have a dependency on its signature file. => I believe strictly for type-checking these aren't hard requirements but I got errors in the IlxGen code later on when not having these requirements. (See final fold experiment) I re-enabled the behaviour of the --test:ParallelCheckingWithSignatureFilesOn. When a backed implementation file is encountered it will be skipped and type-checked in a second phase. I do believe this is the sweet spot. By using a signature file you can speed up the graph and type-check the implementation in parallel later on.

This is slightly confusing to me. The existing algorithm already handles fs-fsi files, and makes sure the required dependencies exist. It doesn't need a second phase, because the .fs files are simply nodes in the graph that nothing else depends on. Having a second phase to me is unnecessary complexity, which is already captured in the graph solution. Keeping both the graph and multiple phases seems unnecessary. What do you think is the advantage of not letting the graph sort out backed .fs files?

For A.fsi, A.fs, B.fs the existing algorithm should create the following graph:

A.fsi ->
A.fs -> A.fsi
A.fsix -> A.fsi
B.fs -> A.fsix

The nodes should be processed in the following order:

A.fsi
A.fs, A.fsix (both using results from `A.fsi`)
B.fs (using results from `A.fsix, A.fsi`)

and the final fold to create final TcState would process A.fsi, A.fs, B.fs in this exact order (A.fsix is excluded from the final fold).

nojaf commented 1 year ago

Hi, I'm definitely open to discussing this during a call sometime. (I can't tell just yet when I'm available).

Consider the case of the compiler where all files are backed. What happens today is that signature files are processed sequentially and the implementation files are in parallel. With the heuristic, the sequential processing will be somewhat parallel while half of the files will still be processed in parallel to the maximum. Depending on what happens in the graph, are you not losing this benefit somewhat? We should try to compare it properly to see what it gives, but with the second phase thing, you know for sure upfront how many files will be processed in parallel.

safesparrow commented 1 year ago

I'm certain the single phase approach is strictly faster in your example. You start processing impl files before you finish processing other files, instead of waiting.

safesparrow commented 1 year ago

I can create some examples that should demonstrate the two points I made, but probably not today.

nojaf commented 1 year ago

Yes, that makes sense actually. For whatever the first phase could block the second and we don't want that. I'll revert those changes.

nojaf commented 1 year ago

So, I finished the PR. These last commits do have the previous fsix awareness without having an explicit fsix node in the graph. This is very debatable of course, whether that really improves the readability of the code. I personally like that the graph is made up of all the files on the disk.

I believe where you would filter the fsix nodes in the graph processing, I'm changing the logic of the fold function based on whether it is the final fold or not.

When it is not the final fold, I add the signature information twice to the tcState (once as the actual signature, and once as a skipped implementation file) and I don't add the implementation file result to the tcState.

When it is the final fold, everything plays out as you would expect.

safesparrow commented 1 year ago

I'm very happy to merge this.

The comments I made are minor - the old solution is far from perfect and we can easily resolve any possibly outstanding minor issues afterwards.

So as soon as you're happy to, please merge the PR 👍

Thanks again for the contribution!