Closed alexdima closed 8 years ago
This file is the database of symbols in your files. Depending on the size of the project the database can become large. The file sizes you listed look normal. You can exclude it from git as you did. You can also control the location of the database (so it is outside your repo). See the two attached screenshots for more details. If you edit the cpp settings file you can find a databaseFilename setting. Use the full path (directory and file name) you want to use. If it's an empty string (or missing from the settings file) then it'll go to the default location.
Mine is 20+ GB.
@tojocky Wow, that seems too big. The largest I've seen is 1.4 GB for Chromium. Does changing some of the settings to reduce the size work for you? You can use files.exclude to remove directories and files that you don't care about having symbols for. Setting limitSymbolsToIncludedHeaders to true might help too, as should setting addWorkspaceRootToIncludePath to false and then selectively adding the directories you actually want symbols for. You should also delete the database or change the databaseFilename after making these settings changes, because the database doesn't self-clean and can accumulate junk from older settings (which we've been planning to fix for a while). This could also be a new bug or be due to symbolic link cycles, but we would need more info to tell.
OK, I'm back to ~20GB
@tojocky Can you provide more info? Do you think this is a bug? You should be able to work around the issue by deleting the database file (or changing databaseFilename) after reducing the scope of the browse.path setting so that it doesn't include so many files. Our database adds all the filenames it recursively detects from browse.path and then parses the files it believes are C/C++ for symbol information. So it's finding too many files, parsing too many files, or both. You could help us diagnose the issue by opening the .browse.vc.db file with a SQLite viewer and looking for what's causing the size bloat. The database also doesn't remove files that no longer exist in the browse.path, so cleaning it up requires a manual deletion (an issue we are planning to fix in September).
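To get a rough sense of how many files a recursive scan of one browse.path entry would pick up, a short script can walk the directory and tally files by extension. This is a minimal sketch and not how the extension itself scans: the names `count_files_by_extension`, `summarize`, and `CPP_EXTENSIONS` are hypothetical, and the extension list is an assumption (for example, extensionless standard headers such as iomanip are not counted as C/C++ here).

```python
import os
from collections import Counter

# Assumed set of extensions treated as C/C++; the extension's real rules may differ.
CPP_EXTENSIONS = {".c", ".cc", ".cpp", ".cxx", ".h", ".hh", ".hpp", ".hxx", ".inl"}


def count_files_by_extension(root):
    """Walk `root` recursively and count files per lowercase extension.

    followlinks=False keeps os.walk from descending into symlinked
    directories, so symbolic link cycles cannot inflate the totals.
    """
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            counts[ext] += 1
    return counts


def summarize(counts):
    """Return (total files, files with a C/C++-looking extension)."""
    total = sum(counts.values())
    cpp = sum(n for ext, n in counts.items() if ext in CPP_EXTENSIONS)
    return total, cpp
```

Calling `summarize(count_files_by_extension(path))` on each browse.path directory gives a total that can be compared before and after narrowing the setting.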
This time I used sqlite3_analyzer.exe to understand what is going on. It seems the table CODE_ITEMS with its indexes takes most of the space.
I ran the SQL command: "select count(*) from code_items;" and the result is: 110702994
/** Disk-Space Utilization Report For C:\Users\ion.lupascu\AppData\Roaming\Code\User\workspaceStorage\dc5891a1df997736f6106d3d0a76af58\ms-vscode.cpptools\.BROWSE.VC.DB
Page size in bytes................................ 4096
Pages in the whole file (measured)................ 5213044
Pages in the whole file (calculated).............. 5213043
Pages that store data............................. 5213042 100.000%
Pages on the freelist (per header)................ 1 0.0%
Pages on the freelist (calculated)................ 2 0.0%
Pages of auto-vacuum overhead..................... 0 0.0%
Number of tables in the database.................. 15
Number of indices................................. 37
Number of defined indices......................... 30
Number of implied indices......................... 7
Size of the file in bytes......................... 21352628224
Bytes of user payload stored...................... 8825666906 41.3%
*** Page counts for all tables with their indices *****************************
CODE_ITEMS........................................ 4808119 92.2%
FILE_SIGNATURES................................... 163027 3.1%
FILES............................................. 102832 2.0%
ASSOC_TEXT........................................ 62247 1.2%
ASSOC_SPANS....................................... 58479 1.1%
BASE_CLASS_PARENTS................................ 18307 0.35%
CONFIGS........................................... 5 0.0%
FILE_MAP.......................................... 5 0.0%
CONFIG_FILES...................................... 4 0.0%
PROJECTS.......................................... 4 0.0%
SQLITE_MASTER..................................... 4 0.0%
SHARED_TEXT....................................... 3 0.0%
CODE_ITEM_KINDS................................... 2 0.0%
PARSERS........................................... 2 0.0%
PROPERTIES........................................ 2 0.0%
*** Page counts for all tables and indices separately *************************
CODE_ITEMS........................................ 2146534 41.2%
IX_CODE_ITEMS_NAME................................ 558092 10.7%
IX_CODE_ITEMS_PARENT_ID_KIND...................... 478439 9.2%
SQLITE_AUTOINDEX_CODE_ITEMS_1..................... 428285 8.2%
IX_CODE_ITEMS_PARENT_ID........................... 416587 8.0%
IX_CODE_ITEMS_LOWER_NAME_HINT..................... 390802 7.5%
IX_CODE_ITEMS_FILE_ID............................. 389380 7.5%
FILE_SIGNATURES................................... 158169 3.0%
FILES............................................. 53016 1.0%
ASSOC_TEXT........................................ 44283 0.85%
UQ_FILES_NAME..................................... 37322 0.72%
ASSOC_SPANS....................................... 25118 0.48%
UQ_ASSOC_SPANS_CODE_ITEM_ID_KIND.................. 17383 0.33%
IX_ASSOC_SPANS_CODE_ITEM_ID....................... 15978 0.31%
UQ_ASSOC_TEXT_CODE_ITEM_ID_KIND................... 9400 0.18%
IX_FILES_LEAF_NAME................................ 8851 0.17%
IX_ASSOC_TEXT_CODE_ITEM_ID........................ 8564 0.16%
UQ_BASE_CLASS_PARENTS_BASE_CODE_ITEM_ID_PARENT_CODE_ITEM_ID 5577 0.11%
BASE_CLASS_PARENTS................................ 4642 0.089%
IX_BASE_CLASS_PARENTS_BASE_CODE_ITEM_ID........... 4044 0.078%
IX_BASE_CLASS_PARENTS_PARENT_CODE_ITEM_ID......... 4044 0.078%
SQLITE_AUTOINDEX_FILES_1.......................... 3643 0.070%
UQ_FILE_SIGNATURES_FILE_ID_KIND................... 2533 0.049%
IX_FILE_SIGNATURES_FILE_ID........................ 2325 0.045%
SQLITE_MASTER..................................... 4 0.0%
CODE_ITEM_KINDS................................... 1 0.0%
CONFIG_FILES...................................... 1 0.0%
CONFIGS........................................... 1 0.0%
FILE_MAP.......................................... 1 0.0%
IX_CONFIG_FILES_CONFIG_ID......................... 1 0.0%
IX_CONFIG_FILES_FILE_ID........................... 1 0.0%
IX_CONFIGS_NAME................................... 1 0.0%
IX_CONFIGS_PROJECT_ID............................. 1 0.0%
IX_FILE_MAP_CODE_ITEM_ID.......................... 1 0.0%
IX_FILE_MAP_CONFIG_ID............................. 1 0.0%
IX_FILE_MAP_FILE_ID............................... 1 0.0%
IX_SHARED_TEXT_HASH............................... 1 0.0%
PARSERS........................................... 1 0.0%
PROJECTS.......................................... 1 0.0%
PROPERTIES........................................ 1 0.0%
SHARED_TEXT....................................... 1 0.0%
SQLITE_AUTOINDEX_CONFIGS_1........................ 1 0.0%
SQLITE_AUTOINDEX_PARSERS_1........................ 1 0.0%
SQLITE_AUTOINDEX_PROJECTS_1....................... 1 0.0%
SQLITE_AUTOINDEX_PROPERTIES_1..................... 1 0.0%
SQLITE_AUTOINDEX_SHARED_TEXT_1.................... 1 0.0%
UQ_CODE_ITEM_KINDS_NAME_PARSER_GUID............... 1 0.0%
UQ_CONFIG_FILES_CONFIG_ID_FILE_ID................. 1 0.0%
UQ_CONFIGS_PROJECT_ID_NAME........................ 1 0.0%
UQ_FILE_MAP_CODE_ITEM_ID_CONFIG_ID_FILE_ID........ 1 0.0%
UQ_PROJECTS_GUID.................................. 1 0.0%
UQ_PROJECTS_NAME.................................. 1 0.0%
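The report's figures can be cross-checked with a little arithmetic: page counts multiplied by the page size give bytes, and dividing the CODE_ITEMS total by the row count reported earlier gives an average cost per code item. The sketch below only restates numbers already quoted in this comment; the constant and function names are illustrative.

```python
# Figures copied from the sqlite3_analyzer report and the count(*) query.
PAGE_SIZE = 4096
PAGES_IN_FILE = 5_213_044
CODE_ITEMS_PAGES_WITH_INDICES = 4_808_119
CODE_ITEMS_ROWS = 110_702_994  # result of select count(*) from code_items


def pages_to_bytes(pages, page_size=PAGE_SIZE):
    """Convert a SQLite page count to bytes."""
    return pages * page_size


file_bytes = pages_to_bytes(PAGES_IN_FILE)
code_items_bytes = pages_to_bytes(CODE_ITEMS_PAGES_WITH_INDICES)

print(file_bytes)  # 21352628224, matching "Size of the file in bytes"
print(f"{code_items_bytes / file_bytes:.1%} of the file is CODE_ITEMS plus its indexes")
print(f"{code_items_bytes / CODE_ITEMS_ROWS:.0f} bytes per code item, indexes included")
```

That works out to roughly 178 bytes per code item once the indexes are included, so the size comes from the sheer number of rows rather than from any single oversized entry.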
I also ran the command "select * from code_items limit 50;" and I see things like:
"27" "1" "0" "35" "65538" "iomanip" "" "1" "31" "18" "31" "0" "0" "0" "0" "" "NULL" "NULL" "NULL" "NULL" "NULL" "ioma"
"28" "1" "0" "35" "65538" "math.h" "" "1" "32" "17" "32" "0" "0" "0" "0" "" "NULL" "NULL" "NULL" "NULL" "NULL" "math"
"29" "1" "0" "35" "65538" "algorithm" "" "1" "33" "20" "33" "0" "0" "0" "0" "" "NULL" "NULL" "NULL" "NULL" "NULL" "algo"
except my hpp files.
Also, I checked how many times a file name is repeated by running "select count(*) from code_items where name="iomanip";": 2098
Question: is this the # of lines?
Let me know if you need more info.
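The single-name count above generalizes to a query that lists the most frequently repeated names in one pass. The following is a minimal sketch that assumes only what the queries in this comment show, namely a `code_items` table with a `name` column; the helper names `open_read_only` and `most_repeated_names` are hypothetical, and on a database of this size the query may take a long time.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path):
    """Open a SQLite file read-only so the analysis cannot modify it."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def most_repeated_names(conn, limit=20):
    """Return (name, count) pairs for the most frequent code_items names."""
    query = (
        "SELECT name, COUNT(*) AS n FROM code_items "
        "GROUP BY name ORDER BY n DESC, name LIMIT ?"
    )
    return conn.execute(query, (limit,)).fetchall()
```

Passing the path of the .BROWSE.VC.DB file to `open_read_only` and the resulting connection to `most_repeated_names` would show whether a handful of names account for a large share of the rows.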
@tojocky Code items are symbols. It looks like your code base has lots of symbols. Do you believe this is expected or does it seem like a bug to you? If non-C/C++ files are being incorrectly parsed due to a file association mapping, that might cause too many symbols to be generated. You could try using files.exclude to remove sections of your code base, which should cause their symbols to be removed. Is the 20 GB database a problem for you? Is performance slow or is it just hogging disk space?
Hi @sean-mcmanus. Regarding performance, I can't complain. Thank you for the great job. For C++ projects I wanted to use a modern IDE, but I'm fine with vim and sometimes Sublime Text. This project is really huge.
The only issue is that it hogs disk space.
I will consider using the files.exclude setting.
BTW, instead of encoding the file name in each code item, wouldn't it be better to move it into a separate table with a primary key? That would avoid repeating a file name thousands of times, and the index also takes a lot of space.
A NoSQL DB would be better.
This is just what I'm thinking.
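The normalization idea can be sketched with an in-memory SQLite database. The `names` and `items` tables below are hypothetical and are not the extension's actual schema; they only illustrate storing each distinct string once and referring to it by an integer key. Note that the report above already lists a FILES table and an IX_CODE_ITEMS_FILE_ID index, so the real schema may already key files this way, with the repetition sitting in the name column instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: each distinct name is stored once.
    CREATE TABLE names (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    -- Each item refers to its name by integer key instead of repeating the text.
    CREATE TABLE items (
        id      INTEGER PRIMARY KEY,
        name_id INTEGER NOT NULL REFERENCES names(id)
    );
    CREATE INDEX ix_items_name_id ON items(name_id);
    """
)


def add_item(conn, name):
    """Insert an item, reusing the existing names row when there is one."""
    conn.execute("INSERT OR IGNORE INTO names (name) VALUES (?)", (name,))
    (name_id,) = conn.execute(
        "SELECT id FROM names WHERE name = ?", (name,)
    ).fetchone()
    conn.execute("INSERT INTO items (name_id) VALUES (?)", (name_id,))


for _ in range(2098):  # the repeat count reported for "iomanip"
    add_item(conn, "iomanip")

items = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
names = conn.execute("SELECT COUNT(*) FROM names").fetchone()[0]
print(items, names)  # 2098 1
```

The trade-off is an extra join on every lookup by name, which is one reason a design might keep the text inline.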
1. Press Ctrl+P.
2. Open c_cpp_properties.json.
3. Edit the following node:
"browse": {
    "path": [
        "${workspaceFolder}",
        "D:/Program Files/VS2017/VC/Tools/MSVC/14.11.25503/include/*",
        "D:/Program Files/VS2017/VC/Tools/MSVC/14.11.25503/atlmfc/include/*",
        "C:/Program Files (x86)/Windows Kits/10/Include/10.0.15063.0/um",
        "C:/Program Files (x86)/Windows Kits/10/Include/10.0.15063.0/ucrt",
        "C:/Program Files (x86)/Windows Kits/10/Include/10.0.15063.0/shared",
        "C:/Program Files (x86)/Windows Kits/10/Include/10.0.15063.0/winrt"
    ],
    "limitSymbolsToIncludedHeaders": true,
    "databaseFilename": "D:/Others/VSCode/browse.vc.db"
},
4. Change the "databaseFilename" value to the location where you want to store the browse.vc.db file.
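With several workspaces to update, the same change could be scripted. This is a minimal sketch: `set_database_filename` is a hypothetical helper, it assumes the file's usual layout of a top-level "configurations" list whose entries each carry a "browse" object, and it works on an already parsed dict because a settings file containing comments would not load with plain `json.load`.

```python
import json


def set_database_filename(properties, db_path):
    """Set browse.databaseFilename on every configuration in a parsed
    c_cpp_properties.json document and return the same dict."""
    for config in properties.get("configurations", []):
        config.setdefault("browse", {})["databaseFilename"] = db_path
    return properties


# Minimal in-memory example; the configuration name and version are illustrative.
example = {
    "configurations": [
        {"name": "Win32", "browse": {"path": ["${workspaceFolder}"]}}
    ],
    "version": 3,
}
updated = set_database_filename(example, "D:/Others/VSCode/browse.vc.db")
print(json.dumps(updated, indent=4))
```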
@ljf1239848066 What's the problem? How big is your file?
@sean-mcmanus More than 20G.
@ljf1239848066 Is your workspace really big? How many files are getting discovered/parsed? If your loggingLevel is high enough, it should show that info in the C/C++ Output window.
I'm working on the AOSP project with different branches, so I need to open several instances at the same time, nearly a million files in total. My reply above is a way to move the .db file off the C: drive to avoid running out of space.
Moved from Microsoft/vscode#10557
From @bharathitman
Steps to Reproduce:
From @AkashGutha
I had the same problem, though mine was only 40 MB. What is this file for?