microsoft / vscode-cpptools

Official repository for the Microsoft C/C++ extension for VS Code.

cpptools takes hours of CPU time per day for a project which compiles in 5 seconds #5574

Open MSoegtropIMC opened 4 years ago

MSoegtropIMC commented 4 years ago

Type: LanguageService

Describe the bug

For every C++ project, even tiny ones which compile in a few seconds, a cpptools process runs for between 6 and 40 minutes every time I open the project, and also from time to time after editing or git operations. Over the day this accumulates to many hours, and I estimate that about 25% of the time my laptop fan runs, it is because of cpptools (it hardly ever runs because of compiles). Also, the memory consumption of cpptools is fairly high: around 7 GB is common. I guess this has to do with scanning system headers, but it still seems wasteful to spend 30 minutes on symbol analysis for a project where a release build takes 5 seconds. This is not really a bug, but to me it looks like an unnecessary waste of resources.

One more observation: when I delete the folder in which VSCode keeps the database (on Mac ~/Library/Application Support/VSCodium/User/workspaceStorage/) it runs the first time "only" for 6 minutes - Intellisense is fully operational. The second time it decides to run for the same projects - about 1 hour later - it runs for 29 minutes.

Steps to reproduce

Expected behavior

In case you cannot reproduce this (for me this has happened for every C++ project, every time, for at least 6 months), I can of course provide logs, a sample project and gprof statistics.

sean-mcmanus commented 4 years ago

Yeah, that is not normal behavior. What message does it say when you hover over the database icon in the status bar? Can you set C_Cpp.loggingLevel to "Debug" and check the C/C++ logging to see what it says in regards to "tag parsing file" messages? If you don't care about files in those locations you could add those locations to your files.exclude setting.

MSoegtropIMC commented 4 years ago

@sean-mcmanus : when I hover over the database icon it shows "Discovering Files" with a file count. The file count started at 496,783, then quickly went to 0, increased again in steps of about 100,000 to e.g. 553,383, then went to zero again, started to increase again. One cycle takes about 10..20 seconds. It seems to cycle through that until something makes it happy or it gives up. I guess what I want is that it goes through this cycle just once. I will inspect the log as soon as it is finished (it seems to be more on the 30 minutes side).

Btw.: I don't think that the OSX system headers are 500,000 files. A find /usr | wc says there are 19,362 and a find /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks | wc says there are 9,322 files, so in total less than 30,000. I have these settings:

{
    "configurations": [
        {
            "name": "Mac",
            "includePath": [
                "${workspaceFolder}/"
            ],
            "defines": ["_DEBUG"],
            "macFrameworkPath": [
                "/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks"
            ],
            "compilerPath": "/usr/bin/clang",
            "cStandard": "c99",
            "cppStandard": "c++11",
            "intelliSenseMode": "clang-x86",
            "compilerArgs": [

            ]
        }
    ],
    "version": 4
}

My workspace folder has about 60 C files in it, but I have links to a few folders which do have a large number of files in my workspace (no filesystem symlinks, I mean references to other folders out of my workspace folder in my VSCode project). But all in all this is also not more than 100,000 files, and very little C (find + wc says 66 C header files and 55 C source files).
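For anyone who wants to cross-check file counts like the ones above, here is a minimal Python sketch. It is not part of cpptools; the function names and the list of C/C++ extensions are assumptions made for illustration. Unlike `find | wc`, it counts only files (not directories) and does not descend into symlinked directories.

```python
import os
from collections import Counter

# Assumed set of "typical" C/C++ extensions; adjust for your project.
C_EXTENSIONS = {".c", ".h", ".cc", ".cpp", ".cxx", ".hpp", ".hh", ".hxx"}


def count_files_by_extension(root):
    """Walk root without following directory symlinks and tally files per extension."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            counts[ext] += 1
    return counts


def summarize(root):
    """Return (total number of files, number of files with a C/C++ extension)."""
    counts = count_files_by_extension(root)
    total = sum(counts.values())
    c_like = sum(n for ext, n in counts.items() if ext in C_EXTENSIONS)
    return total, c_like
```

Calling `summarize()` on each workspace folder and on each system include path gives numbers that can be compared with the "file(s) processed" count in the C/C++ log.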

Btw.: I have a i9 CPU and 32 GB RAM and otherwise my PC is fairly unloaded, both memory and CPU wise.

Still waiting for cpptools to finish so that I can inspect the logs ...

MSoegtropIMC commented 4 years ago

P.S.: It looks like the cycles get slower each time. It is now at about 1 minute per cycle. I hope the database ops don't kill my SSD ... Aah it just finished: 30m45s - I will have a look at the logs.

MSoegtropIMC commented 4 years ago

OK, the logs say it went 240 times through these folders:

  Processing folder (recursive): /usr/local/include/
  Processing folder (recursive): /Library/Developer/CommandLineTools/usr/lib/clang/11.0.3/include/
  Processing folder (recursive): /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/
  Processing folder (recursive): /Library/Developer/CommandLineTools/usr/include/
  Processing folder (recursive): /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks/<200 subfolders>

Under the last of the above folders it scans about 200 subfolders. There are no error messages in the log while it processes folders. It just starts over with /usr/local/include/ when it has finished. The overall log file looks like this:

16 loops over the below block:

File exclude: **/.git
File exclude: **/.svn
File exclude: **/.hg
File exclude: **/CVS
File exclude: **/.DS_Store
File exclude: **/.vscode
Search exclude: **/node_modules
Search exclude: **/bower_components
Search exclude: **/*.code-search
Search exclude: **/.vscode
IntelliSense Engine = Default.
Enhanced Colorization is enabled.
Error squiggles are enabled if all header dependencies are resolved.
Autocomplete is enabled.

(the last 4 lines appear only in the first 15 loops, but not in the 16th loop)

15 Loops over the below block:

Populate include completion cache.
Discovering files...
  Processing folder (recursive): /usr/local/include/
  :
  16 loops (in total 240=15*16) through the same set of a bit more than 200 folders, in total > 50,000 Processing folder lines
  No other messages than "Processing folders" in this block - when it finished it just starts over
  :
  Discovering files: 569698 file(s) processed
  0 file(s) removed from database
Done discovering files.
Parsing open files...
Parsing remaining files...
  Parsing: 0 files(s) processed
Done parsing remaining files.
Done parsing open files.

I didn't check very carefully, but I think the inner loop is always exactly 16. I searched for Discovering files and the distance in lines is about equal.

One more note: cpptools has 30 threads - as far as I can tell for a period of 3 seconds always only two of these threads are active and the others are sleeping. Maybe instead of distributing the work over 15 pairs of threads, each thread pair does the complete job - one after another (the kind of parallelisation some humans enjoy but computers usually not ;-)

In summary, if it went through the folders just once rather than 240 times, it would take just 8 seconds, which would be well within my expectations.
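The repeated passes described above can be counted mechanically. This is a minimal Python sketch, assuming the C/C++ output has been saved to a text file and that the lines have the "Processing folder (recursive): <path>" shape quoted in this thread. The function names are hypothetical.

```python
import re
from collections import Counter

# Matches log lines such as:
#   "  Processing folder (recursive): /usr/local/include/"
FOLDER_LINE = re.compile(r"^\s*Processing folder \(recursive\): (?P<path>.+?)\s*$")


def count_folder_passes(log_lines):
    """Count how often each folder appears in 'Processing folder' log lines."""
    counts = Counter()
    for line in log_lines:
        match = FOLDER_LINE.match(line)
        if match:
            counts[match.group("path")] += 1
    return counts


def repeated_folders(log_lines, threshold=1):
    """Return only the folders that were processed more than `threshold` times."""
    return {
        path: n
        for path, n in count_folder_passes(log_lines).items()
        if n > threshold
    }
```

Usage would be something like `repeated_folders(open("cpptools.log", encoding="utf-8"))`, where `cpptools.log` is a placeholder file name. A healthy run should return an empty dict; in the run described above, each system folder would show up with a count of about 240.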

MSoegtropIMC commented 4 years ago

P.S.: If this is of interest, I can attach the log (with a few private lines xed out)

sean-mcmanus commented 4 years ago

The logging "File exclude: **/.git" should not appear with 0.28.1 unless the cpptools process is crashing, in which case you would see that once per crash. It sounds like you see that 16 times? There's code that checks for 5 crashes in 3 minutes and if that is hit then it won't keep restarting our cpptools process, but if you're hitting the 16 crashes in 9+ minutes then that would explain it.

Can you attach a debugger to the cpptools process to get a call stack for why it's crashing, or check ~/Library/Logs/DiagnosticReports for a crash log?

The discovering files is only supposed to occur 1 time and not process the same folders repeatedly...unless some settings/config change causes it to abort and start over.

You're hitting a problem with our code that recurses through the file system to look for files that it may need to parse later, so it's not actually doing any parsing (it says Parsing: 0 files processed).

I hit an issue on Windows with a symlink causing huge performance problems at https://github.com/microsoft/vscode-cpptools/issues/3123 . Do you know if you have any symlinks that could cause a similar failure? I don't think we have gotten other complaints from Mac users like this yet.
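To answer the symlink question for a given folder, a small Python sketch like the following can list symlinked directories and flag the ones that point back at one of their own ancestors (the kind of loop that could inflate a recursive scan). This is an illustration only, not how cpptools itself walks the file system; the function names are hypothetical.

```python
import os


def find_directory_symlinks(root):
    """Yield (link_path, resolved_target) for each symlink under root that points to a directory."""
    # os.walk lists symlinks to directories in dirnames, but with
    # followlinks=False it does not descend into them.
    for dirpath, dirnames, _filenames in os.walk(root, followlinks=False):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                yield path, os.path.realpath(path)


def points_to_ancestor(link_path, target):
    """True if target is the link's containing directory or one of its ancestors."""
    link_parent = os.path.realpath(os.path.dirname(link_path))
    return os.path.commonpath([link_parent, target]) == target
```

Running `find_directory_symlinks()` over the workspace folders and the include paths (for example /usr/local/include) and checking each result with `points_to_ancestor()` shows whether a recursive scan could revisit the same tree.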

I'm using XCode and not the command line tools and I also don't have /usr/local/include in my includePath (not sure if that would matter). Other than that, the 9k files is what it should be processing, not 569k.

nsoblath commented 4 years ago

In case it can be helpful, I believe I'm seeing the same problem. I see the same repeated blocks in the log output, both the part with the File exclude lines and the part that starts Populate include completion cache..

Here's my system information:

I don't think I have any symlinks as were discussed in #3123.

I don't see anything relevant in ~/Library/Logs/DiagnosticReports.

I don't know if this is relevant, but in the Activity Monitor I see that there are two instances of cpptools (and one cpptools-srv). One has a CPU time of 49.06 seconds, and the other has a time of 26:42.31. Presumably the second is the instance that is doing all of the work when it's doing all of this file discovery.

ghost commented 4 years ago

I have noticed this on my CentOS machine as well. I use this in conjunction with the Remote Development extension. cpptools consistently uses over 80% CPU.

I plan to look into this in detail over the coming weekend. Let me know if you guys have pointers on where I should be looking at.

MSoegtropIMC commented 4 years ago

@MohammadGhazanfar : I didn't have time to look deeper into it, but I also plan to do so, maybe tomorrow (a bank holiday in Germany) or during the weekend. It would be nice to stay in contact on this. I will post here as soon as I have something. I will start with some profile statistics (easy on OSX), which help in setting breakpoints for a first analysis.

ghost commented 4 years ago

Thanks @MSoegtropIMC ! That will be great !

MSoegtropIMC commented 4 years ago

@sean-mcmanus : I tried to debug cpptools. I followed the instructions in (https://github.com/microsoft/vscode-cpptools/blob/master/Documentation/Building%20the%20Extension.md), and I can start VSCode in extension development mode and break and step through the TypeScript code, but it is not clear to me how to debug or even build the cpptools binary which is causing the actual problem. As far as I can tell, the binary is downloaded from Microsoft, following information from the package.json file:

    {
      "description": "C/C++ language components (OS X)",
      "url": "https://go.microsoft.com/fwlink/?linkid=2131173",
      "platforms": [
        "darwin"
      ],
      "binaries": [
        "./bin/cpptools",
        "./bin/cpptools-srv"
      ]
    },

It doesn't look like the build instructions given for the extension actually build this. Can you point me to the source / build instructions of the cpptools binary and maybe give some hints for debugging it? This would really help in making progress.

sean-mcmanus commented 4 years ago

@MSoegtropIMC You don't need to debug the TypeScript part of the extension. Just attach a debugger to the cpptools process and you should be able to get call stacks (i.e. set the path to cpptools in the tasks.json for gdb attach debugging). The cpptools process is closed source.

nsoblath commented 4 years ago

Here are two call stacks for cpptools taken with macOS's Activity Monitor while running VSCode on two occasions recently. In both cases cpptools had been running for a long period of time at about 100% CPU usage.

Sample of cpptools 2.txt Sample of cpptools.txt

Colengms commented 4 years ago

@nsoblath The stacks you posted appear to both be in the process of tag parsing files. You mentioned that you don't think you have any symlinks. Can you confirm?

If you enable debug logging using the setting "C_Cpp.loggingLevel": "Debug", paths to each tag parsed file will be displayed. If there is a symlink, it may be apparent in that output.

sean-mcmanus commented 4 years ago

Actually, it's not doing any tag parsing work yet, just enumerating files, so no "tag parsing" message should appear in the logging. That could occur if a very large workspace folder were opened (or with slow file access) or if it got stuck in some symlink loop. If you temporarily set the C_Cpp.loggingLevel to the hidden value of "9" you should be able to see more logging info about what paths are being processed ("Debug" level is probably not sufficient). See https://github.com/microsoft/vscode-cpptools/issues/3123 .

MSoegtropIMC commented 4 years ago

@sean-mcmanus : since cpp-tools is closed source, would it help to set this up as a CI test on an Azure OSX? I see that you have OSX CI active for this project, so I guess if I would create a pull request which modifies the OSX CI test so that it runs for a long time, this would help you to get this fixed?

Do you have XCode installed on the OSX azure machines (for the system headers)? I guess a cpptools binary should be available in CI, so I just need to create some sort of fake project and call cpptools with proper command line.

MSoegtropIMC commented 4 years ago

@sean-mcmanus : I think I got a bit closer to the mystery. A few observations:

I have a vscode workspace with 16 folders in it, only one of which contains C code. cpptools indexes the OSX / XCode headers once for each of these 16 folders, even those which have neither C code nor C build instructions in them.

In the log (set to level 9) the block for the folder which has C code starts with:

cpptools/didChangeCppProperties
Attempting to get defaults from compiler in "compilerPath" property: '/usr/bin/clang'
terminating child process: 17589
Attempting to get defaults from compiler in "compilerPath" property: '/usr/bin/clang'
terminating child process: 17591
  Folder: /usr/local/include/ will be indexed
  Folder: /Library/Developer/CommandLineTools/usr/lib/clang/11.0.3/include/ will be indexed
  Folder: /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/ will be indexed
  Folder: /Library/Developer/CommandLineTools/usr/include/ will be indexed
  Folder: /Users/<myhome>/<myCprojectfolders>/ will be indexed
then come all the individual SDK subfolders (about 200)

Those workspace folders which do not contain C code start with:

cpptools/didChangeCppProperties
Attempting to get defaults from compiler in "compilerPath" property: '/usr/bin/clang'
terminating child process: 17651
Attempting to get defaults from compiler in "compilerPath" property: '/usr/bin/clang'
terminating child process: 17653
  /Users/<myhome>/<oneofmyotherprojectfolders>/** is not a directory
  Folder: /usr/local/include/ will be indexed
  Folder: /Library/Developer/CommandLineTools/usr/lib/clang/11.0.3/include/ will be indexed
  Folder: /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/ will be indexed
  Folder: /Library/Developer/CommandLineTools/usr/include/ will be indexed
  Folder: /Users/<myhome>/<oneofmyotherprojectfolders>/ will be indexed
then come all the individual SDK subfolders (about 200)

Please note that /Users/<myhome>/<oneofmyprojectfolders>/ has subfolders, but no C files. One of my non-C workspace folders has about 80,000 files in it, but as far as I can tell it goes through it quickly. What seems to take time is scanning the system headers once for each workspace folder.

I guess the database doesn't get faster by putting the system header symbol information 16 times into it.

So I guess what is needed is a way to tell cpptools which project folders to handle and which to ignore.

MSoegtropIMC commented 4 years ago

> OK, the logs say it went 240 times through these folders:

P.S.: the above 16 loops took 2 minutes. As it looks (see the post I cited above), it sometimes goes through the workspace folders quadratically: something like it creates a thread for each workspace folder, and then each of these threads goes through all workspace folders. Then it takes about 40 minutes.
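As a rough sanity check of the numbers in this thread, the arithmetic can be sketched in Python. The assumption, which is mine and not confirmed by the maintainers, is that every pass over the system folders costs about the same. An earlier comment notes that the cycles get slower over time, so this is only an approximation.

```python
def seconds_per_pass(total_seconds, passes):
    """Average cost of one pass over the system header folders."""
    return total_seconds / passes


def estimate_seconds(passes, per_pass):
    """Estimated total time for a given number of passes."""
    return passes * per_pass


# 16 passes (one per workspace folder) took about 2 minutes.
per_pass = seconds_per_pass(2 * 60, 16)

# Linear case: one pass per workspace folder.
linear = estimate_seconds(16, per_pass)

# "Squared" case: the 15 * 16 = 240 passes counted in the earlier log.
quadratic = estimate_seconds(15 * 16, per_pass)

print(per_pass, linear / 60, quadratic / 60)  # prints: 7.5 2.0 30.0
```

The 30-minute estimate for 240 passes is close to the 30m45s run reported earlier in the thread, which is consistent with the run time being dominated by repeated passes over the same folders.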

nsoblath commented 4 years ago

@Colengms @sean-mcmanus I confirm that I have no symlinks in my project directories. I recursively checked through all of those directories, and there were no symlinks to directories.

It's entirely possible that there are symlinks in the directories that are being searched for system headers and other dependencies, because I have many packages installed with homebrew, which seems to install everything with symlinks.

I took a look at what I could see with log level 9. After restarting VSCode, cpptools started rerunning, and eating up a lot of memory, much more than when on lower log levels (which I confirmed by going back to Debug level and rerunning). I copied the log output (18 MB of text) and sampled the process before quitting VSCode due to high memory usage. The log file is filled with lines that look like this:

cpptools/didChangeCppProperties
path_utf8::split_path(): /Users/<USER>/Software/Project8/cicada
  Found part: 
  Found part: Users
  Found part: <USER>
  Found part: Software
  Found part: Project8
  Found part: cicada
  folder: /Users/<USER>/Software/Project8
  file name: cicada
  file extension: 
path_utf8::split_path(): /Users/<USER>/Software/Project8/cicada
  Found part: 
  Found part: Users
  Found part: <USER>
  Found part: Software
  Found part: Project8
  Found part: cicada
  folder: /Users/<USER>/Software/Project8
  file name: cicada
  file extension: 
path_utf8::split_path(): /Users/<USER>/Software/Project8/cicada
  Found part: 
  Found part: Users
  Found part: <USER>
  Found part: Software
  Found part: Project8
  Found part: cicada
  folder: /Users/<USER>/Software/Project8
  file name: cicada
  file extension: 
path_utf8::split_path(): /Users/<USER>/Software/Project8/cicada
  Found part: 
  Found part: Users
  Found part: <USER>
  Found part: Software
  Found part: Project8
  Found part: cicada
  folder: /Users/<USER>/Software/Project8
  file name: cicada
  file extension: 
Attempting to get defaults from compiler in "compilerPath" property: '/usr/bin/clang'
fork complete (parent process, child pid = 90454)
terminating child process: 90454
Attempting to get defaults from compiler in "compilerPath" property: '/usr/bin/clang'
fork complete (parent process, child pid = 90465)
terminating child process: 90465
path_utf8::split_path(): /Users/<USER>/Software/Project8/cicada
  Found part: 
  Found part: Users
  Found part: <USER>
  Found part: Software
  Found part: Project8
  Found part: cicada
  folder: /Users/<USER>/Software/Project8
  file name: cicada
  file extension: 
path_utf8::split_path(): /usr/local/include
  Found part: 
  Found part: usr
  Found part: local
  Found part: include
  folder: /usr/local
  file name: include
  file extension: 
directory_utf8_entry constructor: /usr/local/include/pstoedit (directory)
path_utf8::split_path(): /usr/local/include/pstoedit
  Found part: 
  Found part: usr
  Found part: local
  Found part: include
  Found part: pstoedit
  folder: /usr/local/include
  file name: pstoedit
  file extension: 
  _directory_name: /usr/local/include/pstoedit
  _file_name: 
directory_utf8_entry constructor: /usr/local/include/wx-3.0 (directory)
path_utf8::split_path(): /usr/local/include/wx-3.0
  Found part: 
  Found part: usr
  Found part: local
  Found part: include
  Found part: wx-3.0
  folder: /usr/local/include
  file name: wx-3
  file extension: .0
  _directory_name: /usr/local/include/wx-3.0
  _file_name: 
directory_utf8_entry constructor: /usr/local/include/lzma.h (file)
path_utf8::split_path(): /usr/local/include/lzma.h
  Found part: 
  Found part: usr
  Found part: local
  Found part: include
  Found part: lzma.h
  folder: /usr/local/include
  file name: lzma
  file extension: .h
parent_path: /usr/local/include/lzma.h
path_utf8 constructor from parts: combined = /usr/local/include
path_utf8::split_path(): /usr/local/include
  Found part: 
  Found part: usr
  Found part: local
  Found part: include
  folder: /usr/local
  file name: include
  file extension: 
path_utf8::split_path(): lzma.h
  Found part: lzma.h
  folder: 
  file name: lzma
  file extension: .h
  _directory_name: /usr/local/include
  _file_name: lzma.h
directory_utf8_entry constructor: /usr/local/include/QtWebSockets (directory)
path_utf8::split_path(): /usr/local/include/QtWebSockets
  Found part: 
  Found part: usr
  Found part: local
  Found part: include
  Found part: QtWebSockets
  folder: /usr/local/include
  file name: QtWebSockets
  file extension: 
  _directory_name: /usr/local/include/QtWebSockets
  _file_name: 

bobbrow commented 4 years ago

Yeah, log level 9 is too noisy even for us. I don't recommend going above 7. For internal development, we can promote some of the level 8 and 9 messages we care about to a lower number temporarily, but we don't release the extension that way.

sean-mcmanus commented 4 years ago

> @sean-mcmanus : since cpp-tools is closed source, would it help to set this up as a CI test on an Azure OSX? I see that you have OSX CI active for this project, so I guess if I would create a pull request which modifies the OSX CI test so that it runs for a long time, this would help you to get this fixed?
>
> Do you have XCode installed on the OSX azure machines (for the system headers)? I guess a cpptools binary should be available in CI, so I just need to create some sort of fake project and call cpptools with proper command line.

The TypeScript CI test just tests some of our TypeScript functionality (we have other internal tests that test cpptools). If you can modify our TypeScript tests to get a repro that sounds like it may help. I'm not sure if XCode is installed (we don't use those machines to build stuff). cpptools doesn't use the command line to communicate -- it uses JSON messages, so you should probably send messages via TypeScript.

sean-mcmanus commented 4 years ago

> Yeah, log level 9 is too noisy even for us. I don't recommend going above 7. For internal development, we can promote some of the level 8 and 9 messages we care about to a lower number temporarily, but we don't release the extension that way.

I mentioned 9 because that is the level at which we log directory iteration and exclusion info (7 is insufficient).

bobbrow commented 4 years ago

Ok. I figured 7 would be enough since it logs the decision around whether a file is parsed or not (e.g. the timestamp check). If a file doesn't show up there, then it was excluded.

sean-mcmanus commented 4 years ago

@MSoegtropIMC The issue you are describing appears to be https://github.com/microsoft/vscode-cpptools/issues/5156 , not sure if there are additional issues as well. We would have to add code to avoid unnecessary duplicate directory iteration of the system paths -- the reason for processing the system headers per workspace is that certain folders may have settings that cause different results, e.g. files.exclude, limitSymbolsToIncludedHeaders, etc.

sean-mcmanus commented 4 years ago

@nsoblath The log isn't filled with cpptools/didChangeCppProperties and "Attempting to get defaults from compiler in "compilerPath" property:", right? You mean it's filled with path/directory_utf8 logging, right?

Yeah, my guess would be your system headers paths are triggering https://github.com/microsoft/vscode-cpptools/issues/3123 -- you should be able to confirm that via setting your compilerPath to "" and see if the problem is fixed.

sean-mcmanus commented 4 years ago

@MSoegtropIMC You should be able to avoid processing in the non-C/C++ folders via setting C_Cpp.default.browse.path to [] and C_Cpp.default.compilerPath to "" in your workspace settings, and then just overriding those in the folders that have C/C++ files.

MSoegtropIMC commented 4 years ago

@sean-mcmanus : I put these lines into my workspace file and didn't override this in any of the subfolders:

        "C_Cpp.loggingLevel": "7",
        "C_Cpp.default.browse.path": [],
        "C_Cpp.default.compilerPath": ""

The effect is that it doesn't scan my user folders but it still scans the system headers 16 times. I guess I need to set the Mac framework path as well to empty? Setting the compiler path to empty produces these warnings:

cpptools/didChangeCppProperties
Compiler in "compilerPath" property not found: 
Compiler in "compilerPath" property not found: 
  Folder: /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks/MetricKit.framework/Versions/A/Headers/ will be indexed
:

but it still happily indexed the files.

MSoegtropIMC commented 4 years ago

@sean-mcmanus : Setting in addition

"C_Cpp.default.macFrameworkPath": []

in the workspace file doesn't help either. It still scans the system headers once for each workspace folder.

MSoegtropIMC commented 4 years ago

One more experiment: If I set:

"C_Cpp.default.browse.path": ["/Users/<me>/Empty"],
"C_Cpp.default.compilerPath": "",
"C_Cpp.default.macFrameworkPath": ["/Users/<me>/Empty"]

where /Users/<me>/Empty is an existing but empty folder, it still scans the system folders, but as it looks only once. The runtime doesn't seem to be reduced by this, though (I am at 12 minutes now). Do I need to clear the cache and restart from scratch for each such experiment?

MSoegtropIMC commented 4 years ago

I did one more experiment: simply creating a separate workspace which contains only my C files (two workspace folders). For this workspace cpptools takes a perfectly reasonable 11 seconds, including parsing all system headers two times. My other project, which contains the same number of C files but many additional files in other languages, ran for 27 minutes today.

So in summary I would say:

sean-mcmanus commented 4 years ago

@MSoegtropIMC That is odd. All I had to do on Mac was to set C_Cpp.default.compilerPath to "" (in the workspace file) and it caused all the system/framework paths to not be iterated over, although I did hit an issue on Windows where I had to reload window before the setting would take effect on the non-active workspace folders. Also, the processing of the system folders was very fast, so even if it was re-done per-workspace folder it only took seconds -- is your limitSymbolsToIncludedHeaders set to the default of true? So I don't know what is taking the large CPU time for you. Also, my comment to use loggingLevel "9" or "7" was for the other user, so if you're concerned with performance, setting it back to "Debug" is recommended.

@Colengms @michelleangela Is either of you able to repro this performance issue with multiroot?

MSoegtropIMC commented 4 years ago

@sean-mcmanus:

> So I don't know what is taking the large CPU time for you.

As I said: I think cpptools spends most of its time trying to digest the nearly 500,000 files in my workspace folders that it wasn't made for (mostly HTML, CSS, PNG, various ML style languages). How can I tell cpptools to ignore files in specific workspace folders? And how can I tell it to ignore files with extensions not typically used for C/C++? As far as I can tell, cpptools reads all files regardless of extension, including html, pdf, png, binaries, .... Maybe this is because C++ STL headers tend to have no extension at all (e.g. /Library/Developer/CommandLineTools/usr/include/c++/v1/iostream), but maybe one could restrict cpptools to typical C++ extensions or files with no extension (no dot in the name) at all.

As I mentioned, the debug log ends with:

  466966 file(s) removed from database

More than 99% of these removed files are non C files in my workspace folders.

Or alternatively have a way to exclude certain extensions and maybe issue a warning when cpptools reads and later removes more than 100 files with a certain extension. Something like:

Parsed and removed 12,345 files with extension .py - set option XYZ to ignore files with this extension in future runs of cpptools.
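
The proposal above can be sketched in a few lines of Python. This only illustrates the suggested filter and warning; it is not cpptools behavior, and the extension list, function names, and message text are assumptions.

```python
import os
from collections import Counter

# Assumed list of typical C/C++ extensions (illustrative only).
C_CPP_EXTENSIONS = {".c", ".cc", ".cpp", ".cxx", ".h", ".hh", ".hpp", ".hxx", ".inl"}


def is_candidate(path):
    """Keep files with a typical C/C++ extension, or with no dot in the name
    (C++ standard library headers such as 'iostream' have no extension)."""
    name = os.path.basename(path)
    if "." not in name:
        return True
    return os.path.splitext(name)[1].lower() in C_CPP_EXTENSIONS


def skipped_extension_warnings(paths, threshold=100):
    """Build one warning per extension that was skipped more than `threshold` times."""
    skipped = Counter(
        os.path.splitext(p)[1].lower() for p in paths if not is_candidate(p)
    )
    return [
        f"Skipped {n:,} files with extension {ext}"
        for ext, n in sorted(skipped.items())
        if n > threshold
    ]
```

With a filter like this, a workspace full of HTML or PNG files would produce a handful of warnings instead of hundreds of thousands of database entries.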

sean-mcmanus commented 4 years ago

You can add folders to files.exclude -- that should prevent the files from being added to the database. You can exclude file types by adding something like "**/*.HTML" and changing C_Cpp.exclusionPolicy to "checkFilesAndFolders" (but it might reduce performance). We don't read or parse the files, just add the names to a database (in case the files are #included)...we have plans to change that eventually. If the files.exclude doesn't solve the performance issue, it could be caused by our path iteration code adding all files to a vector before it checks files.exclude, which we're tracking with https://github.com/microsoft/vscode-cpptools/issues/3123 . It's not normal for 500k files to be removed from the database...that sounds like it could be a bug (not sure what the repro is).

nsoblath commented 4 years ago

> @nsoblath The log isn't filled with cpptools/didChangeCppProperties and "Attempting to get defaults from compiler in "compilerPath" property:", right? You mean it's filled with path/directory_utf8 logging, right?
>
> Yeah, my guess would be your system headers paths are triggering #3123 -- you should be able to confirm that via setting your compilerPath to "" and see if the problem is fixed.

@sean-mcmanus That's correct, the log is filled with a whole ton of path/directory_utf8 logging. There are exactly six instances of "Attempting to get defaults . . ." in the entire log that look like this:

Attempting to get defaults from compiler in "compilerPath" property: '/usr/bin/clang'
fork complete (parent process, child pid = 78159)
terminating child process: 78159
Attempting to get defaults from compiler in "compilerPath" property: '/usr/bin/clang'
fork complete (parent process, child pid = 78170)
terminating child process: 78170

I took your suggestion and set the compilerPath to "". I started with the logging level at "9" and the application quickly grew in memory like it did before. Switching the logging back to "Debug," I got the attached log file after a few minutes. At the moment cpptools appears to be behaving reasonably, as far as its CPU usage goes. I'll mention here if anything changes. Thank you!

cpptools output 4.txt

MarcianoPreciado commented 3 years ago

This appears to have been a problem for years. If you need some more info I recommend you open a fresh MacBook, install VSCode and C/C++ tools and set it loose on Zephyr RTOS, and watch it melt your new MacBook. In 5 min I found the following, they may or may not be helpful. But this tool is useless on MacOS because it forces it to overheat in minutes, and runs ALL the time.

2020: MacOS:6149

2019: Linux:3507 Linux:3213 Linux:4418 Linux:2991

2018: Linux:1846 MacOS:2742

MSoegtropIMC commented 3 years ago

What I don't understand is why this has a "need repro" tag. For me this always happens on any OS (I am using Mac, Windows, and several Linux distros) with any real world project. Is it really so that none of the maintainers can reproduce this?

sean-mcmanus commented 3 years ago

@MSoegtropIMC Yeah, I don't believe I've seen a repro yet, other than https://github.com/microsoft/vscode-cpptools/issues/3123 . We've tested on the Linux kernel, LLVM, Chromium, etc.

This issue is also tagged with "investigate" so there's probably stuff we could do to look into a repro.

MSoegtropIMC commented 3 years ago

@sean-mcmanus : can you tell me on what OS you are working? Then I can test it there with the latest released VSCode and prepare a small script which downloads some open source software and send a zip with matching vscode project file(s).

sean-mcmanus commented 3 years ago

@MSoegtropIMC Windows 10 x64 and ARM64, Mac Big Sur x64, Ubuntu 20 x64, and Raspbian/Debian 10 armhf and aarch64.

MSoegtropIMC commented 3 years ago

@sean-mcmanus : perfect, thanks! I am sure we can find an overlap between your and my systems, where this can be reproduced.

fyta2000 commented 3 years ago

Any updates on if this can be fixed? I've been searching a long time for a solution. IntelliSense finds over 5 million files. LOL.

Setting "C_Cpp.default.browse.path" does not work, because this does not limit which files it scans. Setting "files.exclude" does not work, because this removes my ability to see build output folders in VSCode.

I want to exclude the build output folders from the cpptools scanning.

I'm on Mac 10.15.7, VSCode 1.53.2, IntelliSense v1.2.2-insider. My build system is Bazel, and my workspace contains the repo root.

I've seen the previous messages about symlinks, and this is probably the problem, since Bazel produces symlinks for the build output folders. Therefore, I would be fine just telling IntelliSense to ignore them, so it will stop following the symlinks and thinking there are 5 million files to scan.

sean-mcmanus commented 3 years ago

@fyta2000 We're working on making files.exclude work during the file scanning phase (https://github.com/microsoft/vscode-cpptools/issues/3123) -- if that fixes your issue, we could also consider adding some additional parse.exclude setting.

kg commented 3 years ago

I'm experiencing this in my linux development VM for https://github.com/dotnet/runtime, running a VS Code client on Windows connected to the linux host (so cpptools runs on the linux host along with some node processes, while the UI runs outside of the linux host). Has any testing happened with that repo for this issue? It does seem to be much worse during builds, which is confusing because I got the impression that vscode takes gitignore into account for these things like it does for project-wide search and build artifact directories are ignored.

When I checked in and opened top, cpptools' cpu time total was around 65 hours, with xorg and konsole at 23 and 21 hours and node (probably the cpptools parent?) at 19. cpptools also was using about half a gig of memory (2.4g virtual).

fyta2000 commented 3 years ago

> @fyta2000 We're working on making files.exclude work during the file scanning phase (#3123) -- if that fixes your issue, we could also consider adding some additional parse.exclude setting.

OK, so what I tried was to add the bazel build output symlink paths to the "files.exclude", and this seems to mitigate the issue. However, I do want to see these output paths in my workspace, so I can see the outputs of my build (generated code, build artifacts, etc.).

So, just more FYI, my build output paths look like /bazel-*/. These are just symlinks to somewhere else. But some of these symlinks contain symlinks, and we end up with cpptools thinking there are millions of files. Maybe cpptools could resolve the paths, so symlink paths are removed; then it should be able to realize it has already gone down a certain path.

For now, what I did was add to "files.exclude" an entry like this: "**/bazel-testlogs/**/bazel-*/" : true, so that the recursive symlink paths get removed from this particular symlink path.
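
The "resolve the paths first" idea can be illustrated with a short Python sketch. It is only a model of the suggestion, not the actual cpptools traversal (which is closed source); the function name is hypothetical.

```python
import os


def walk_unique(root):
    """Yield each real directory reachable from root exactly once, following symlinks.

    Every directory is resolved with os.path.realpath before it is visited,
    so symlink loops and aliases of already-visited trees are skipped.
    """
    seen = set()
    stack = [root]
    while stack:
        real = os.path.realpath(stack.pop())
        if real in seen or not os.path.isdir(real):
            continue
        seen.add(real)
        yield real
        try:
            with os.scandir(real) as entries:
                for entry in entries:
                    # is_dir() follows symlinks, so symlinked directories
                    # are queued too; the realpath check above dedupes them.
                    if entry.is_dir():
                        stack.append(entry.path)
        except OSError:
            # Unreadable directory: it has been reported, but is not descended into.
            continue
```

With this approach, a layout where symlinks contain further symlinks back into the same tree is enumerated once per real directory, instead of once per path that leads to it.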

sean-mcmanus commented 3 years ago

@fyta2000 To see the output paths in your workspace, you can add it to a separate workspace folder that has C_Cpp.default.browse.path = [].

However, FYI, we haven't implemented the files.exclude handling in the scanning phase yet, but it should avoid any extra tag parsing later on.

chenhengqi commented 3 years ago

Is cpptools itself open source?

chenhengqi commented 3 years ago

I also suffer this problem frequently.

Here is an output of the BCC tool profile tracing the cpptools process.

cpptools.profile

sean-mcmanus commented 3 years ago

@chenhengqi cpptools is not open source. Are you able to analyze the profile data to determine which call stack is causing the problem? I'm not familiar with the BCC tool.

chenhengqi commented 3 years ago

@sean-mcmanus I've tried on linux, but it seems some symbols are optimized out.

sean-mcmanus commented 3 years ago

@chenhengqi What do you mean by "optimized out"? The symbols for call stacks should be available on Linux without much being optimized out. I see the call stacks in your cpptools.profile.