microsoft / vscode-cmake-tools

CMake integration in Visual Studio Code
https://marketplace.visualstudio.com/items?itemName=vector-of-bool.cmake-tools
MIT License
1.48k stars 454 forks source link

Improve performance of caching #4105

Open JVApen opened 1 month ago

JVApen commented 1 month ago

Brief Issue Summary

As soon as CMake Tools is done with configuring the project, it crashes the extension host. The same happens if I use an already-configured configuration, where it only needs to read the cache. I haven't been able to get much details, so any hints on how to provide extra information is welcome.

CMake Tools Diagnostics

{
  "os": "win32",
  "vscodeVersion": "1.93.1",
  "cmtVersion": "1.19.52",
  "configurations": [
    {
      "folder": "e:\\Mega\\main",
      "cmakeVersion": "3.30.2",
      "configured": false,
      "generator": "Ninja",
      "usesPresets": true,
      "compilers": {}
    }
  ],
  "cpptoolsIntegration": {
    "isReady": false,
    "hasCodeModel": false,
    "activeBuildType": "",
    "buildTypesSeen": [],
    "requests": [],
    "responses": [],
    "partialMatches": [],
    "targetCount": 0,
    "executablesCount": 0,
    "librariesCount": 0,
    "targets": []
  },
  "settings": [
    {
      "communicationMode": "automatic",
      "useCMakePresets": "always",
      "configureOnOpen": true
    }
  ]
}

### Debug Log

```shell
I used trace instead of debug as log level

[rollbar] Invoking function [$] with Rollbar wrapping [$Processing "data" event from proc stdout]
[cmake] Checking for new versions of NuGet packages with floating version...
[rollbar] Invoking function [$] with Rollbar wrapping [$Processing "data" event from proc stdout]
[cmake] Temporary project with all NuGet packages was not modified.
[rollbar] Invoking function [$] with Rollbar wrapping [$Processing "data" event from proc stdout]
[cmake] 
[cmake] Restoring .\build\nuget\nuget-dependencies.csproj
[kit] Not reading non-existent kits file: e:\Mega\main\.vscode\cmake-kits.json
[rollbar] Invoking async function [] with Rollbar wrapping [Reloading variants file e:\Mega\main\compile_flags.txt]
[variant] Invalid variants specified:
[variant]  >> : should be object
[variant] Loaded default variants
[rollbar] Invoking function [$] with Rollbar wrapping [$Processing "data" event from proc stdout]
[cmake] NuGet restore is not required.
[rollbar] Invoking function [$] with Rollbar wrapping [$Processing "data" event from proc stdout]
[cmake] -- Configuring done (29.1s)
[extension] [4667] cmake.logDiagnostics started
[extension] [4667] cmake.logDiagnostics finished (returned undefined)
[rollbar] Invoking function [$] with Rollbar wrapping [$Processing "data" event from proc stdout]
[cmake] -- Generating done (67.5s)
[rollbar] Invoking function [$] with Rollbar wrapping [$Processing "data" event from proc stdout]
[cmake] -- Build files have been written to: E:/Mega/main/.out/asan-clangcl
[rollbar] Invoking function [$] with Rollbar wrapping [$Resolving process on "close" event]
[cmakefileapi-driver] 
[cmakefileapi-driver] Checking for new versions of NuGet packages with floating version...
Temporary project with all NuGet packages was not modified.

Restoring .\build\nuget\nuget-dependencies.csproj
NuGet restore is not required.
-- Configuring done (29.1s)
-- Generating done (67.5s)
-- Build files have been written to: E:/Mega/main/.out/asan-clangcl

[cmakefileapi-parser] Read reply folder: E:\Mega\main\.out\asan-clangcl\.cmake\api\v1\reply
[cmakefileapi-parser] Found index files: ["cache-v2-df5af9cfa1bf6faef9ac.json","cmakeFiles-v1-b69ee9cd2ad46d595f75.json" ...
[rollbar] Invoking async function [] with Rollbar wrapping [Update code model for cpptools]
[extension] Not updating the configuration provider because "C_Cpp.intelliSenseEngine" is set to "Disabled"


### Additional Information

_No response_
gcampbell-msft commented 1 month ago

@JVApen What makes you suspect that the CMake Tools extension is what crashes the extension host? I ask because the thing that makes you suspicious of the extension may be helpful in understanding the issue and investigating.

JVApen commented 1 month ago

Multiple reasons. It is nicely timed with the configuration that is finished. It shows some message about the extension crashing/exception (I haven't been able to screenshot in order to read it) at the location it indicates the configuration is ongoing. When I disable the extension or cancel the configuration, the crash doesn't happen. I can't exclude another extension reacting on something that the cmake tools initiates. Though even with the log level trace enabled in VS Code, it doesn't even mention why it crashed.

gcampbell-msft commented 1 month ago

@JVApen Interesting, definitely sounds suspicious, could you attempt to send us a repro project, as well as a recording (a .gif should be fine) of the issue? I wonder if you can get a screen recording maybe we can see the error message

JVApen commented 1 month ago

That is going to be challenging as it is linked to a large proprietary project, so I'll need to manually reduce it. That's going to take some time

JVApen commented 1 month ago

Together with a colleague, we've found what was happening. We basically encountered https://github.com/microsoft/vscode-cmake-tools/issues/970 Though there is nothing in VS Code that even gave the hint of an out of memory.

It seems that when loading the compile_commands.json of only 375MB, the process starts using 4.5+GB, after which it hits a limit and disappears. I find a +10x memory usage quite excessively as memory usage for reading in the file that is only selectively used later on. If I read the code correctly, it makes a full in-memory copy of content of the file to store in a map. As optimization, it could only store the filename and the offset in the file, and extract the compile command the moment the data is actually needed. (At least, that's what I would do in C++, I don't know if typescript is powerful enough to do the same)

gcampbell-msft commented 1 month ago

@JVApen Thanks for the investigation! So, you're saying that you think the issue is that we're storing the entire file? Or is it because of how long it takes to read the file?

JVApen commented 1 month ago

I believe it is because you are storing way too much. If I look at the class CompilationDatabase, it looks to me like it reads the whole file and stores all the info in a map, such that it is available for later use

gcampbell-msft commented 1 month ago

@JVApen Ah, got it. And that's where your suggestion comes in, you're suggesting that we instead, we opt for only reading the file when we actually need to access the file? I'm not 100% familiar with how to access the file location by offset, since it's JSON and I'm not sure whether or not typescript allows for that low level of manipulation.

This presents an interesting problem, because I'm sure the reason that it was stored to begin with is because of the perf hit that we would get if we read the file in every time we want to access something from it

JVApen commented 1 month ago

Sounds indeed like an interesting challenge. Did you parse the whole file when you did so?

What I'm suggesting is:

You can also take the whole file in memory and store string views to it. Some quick Googling gave me https://github.com/epoberezkin/json-source-map which seems to have do something like this.

Regardless, my expectation for a 400MB file would be at most 100MB on caching structure around it.