jacobdufault / cquery

C/C++ language server supporting multi-million line code base, powered by libclang. Emacs, Vim, VSCode, and others with language server protocol support. Cross references, completion, diagnostics, semantic highlighting and more
MIT License

Create a client for Sublime Text #363

Open papadokolos opened 6 years ago

papadokolos commented 6 years ago

Hello! :smile_cat:

I'm a C++ developer, and I'm using Sublime Text 3 as my primary editor. I recently discovered this exciting project, and I must begin with a big Thank You for this amazing effort.

If I wasn't clear yet, I would like to bring cquery to Sublime Text :smile:.

For this purpose, I'll write here important information which should help you decide how to approach my request.

Introduction to Sublime Text 3

Sublime Text 3 is a text editor written in C++ at its core, and is easily hackable/extendable through its well-documented Python API for plugin developers.

Currently, Sublime is pretty widespread, and has an active (and very kind :stuck_out_tongue_winking_eye: ) plugin development community. Most of the plugins are available via Package Control, the official package manager.

Sublime Text as C++ IDE

Marketing itself as a text editor, Sublime does not focus on providing IDE-like features, but rather on simplicity of use and performance. As a result, most of the C++ oriented features come from external plugins, similar to VSCode.

However, Sublime Text's core (which is written in C++) is very efficient, and as such, plugin developers are encouraged to use it (via the Python API) in order to complete their tasks. In other words, when possible, it is better to let Sublime do the work for you than to do it yourself, since it will probably be more efficient.

Furthermore, it's worth noting that everything in Sublime Text supports fuzzy searching, which is a very powerful feature that also happens to be efficient.

Currently Available features/plugins

In order to give you a complete picture, I will write both the built-in features available to a new, registered, Sublime Text user (with the newest development build), and the "most feature rich" state which combines several plugins to achieve a more C++ IDE-like experience.

Core C++ Oriented Features (Sublime Text Build 3156)

Regex based syntax highlighting

Sublime provides syntax highlighting using a custom regular expression engine, and supports both TextMate's .tmLanguage file format as well as its own .sublime-syntax file format. The core syntax highlighting files are available at Sublime's Core Packages Repository.

Hashed Syntax Highlighting

As of build 3153, Sublime is able to draw different words with different colors based on their hash:

jsp (Sublime Text Developer): "A given word is assigned a color from a table based on the hash of its contents. This way most variable names etc are different colors, but the same name always has the same color."
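As a rough illustration of the idea in that quote, here is a minimal sketch of hash-based coloring. The palette and the choice of CRC32 are assumptions made for the example; the thread does not say which hash or color table Sublime actually uses.

```python
import zlib

# Hypothetical palette; Sublime would take its colors from the active color scheme.
PALETTE = ["#e06c75", "#98c379", "#e5c07b", "#61afef", "#c678dd", "#56b6c2"]


def color_for_word(word, palette=PALETTE):
    """Pick a palette entry from a stable hash of the word's bytes."""
    # zlib.crc32 is deterministic across runs, unlike the built-in hash() for str.
    return palette[zlib.crc32(word.encode("utf-8")) % len(palette)]


for name in ("index", "count", "index"):
    # The same name always maps to the same color; different names usually differ.
    print(name, color_for_word(name))
```

Because the color depends only on the spelling of the word, two unrelated variables with the same name get the same color, which is exactly what separates this from semantic highlighting.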

GoTo Symbol Definition/Declaration/References - file wide and project wide

Sublime indexes the entire project internally, based on the regex syntax highlighting. After it finishes, it lets you navigate to a symbol's definition/declaration/references, and search for symbols in the current file or across the whole project. This can be done both with keyboard shortcuts and by mouse hover.

BUT, and this is a big one, this whole thing is done non-semantically! All symbols with the same name are proposed (no context filtering, not even by language!), which makes it harder to find the right symbol, and many symbols are left unrecognized because the complex C++ syntax can't be parsed entirely with regex alone.
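To make the ambiguity concrete, here is a toy sketch of a name-only symbol index. The records and the function name are hypothetical and are not taken from Sublime's internals; the point is only that a lookup keyed on the bare name cannot tell unrelated symbols apart.

```python
from collections import defaultdict

# Hypothetical symbol records: (name, file, line, kind).
SYMBOLS = [
    ("size", "vector.h", 120, "method"),
    ("size", "string.h", 88, "method"),
    ("size", "layout.py", 12, "function"),
]


def build_name_index(symbols):
    """Group symbol locations by bare name, ignoring scope, type, and language."""
    index = defaultdict(list)
    for name, path, line, kind in symbols:
        index[name].append((path, line, kind))
    return index


index = build_name_index(SYMBOLS)
# All three entries come back for "size", including the one from a Python file.
candidates = index["size"]
print(len(candidates))
```

A semantic index would instead key on something that identifies the entity itself, so that only the symbol actually referenced at the cursor is returned.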

Snippets

Sublime's default C++ package provides some built-in snippets for class, if/else, ifndef, namespace, and such.

Sublime Text's API Limitations

Before I go on with external plugins, I would like to mention the limitations of Sublime's API:

  1. There is currently no API for overriding a word's color, besides using a custom syntax highlighting file. This means that there seems to be no way to implement semantic syntax highlighting.
  2. There is no way to interfere with Sublime's internal symbol index. This means that file-wide and project-wide symbol indexing must be handled on the plugin side.
  3. Sublime has yet to provide an actual GUI API for creating buttons, frames and such. The closest thing to it is the tooltip, which supports HTML and CSS formatting (minihtml). This results in one of Sublime's top weaknesses: the lack of a good debugger/debugging environment.

Most C++ Feature Rich Sublime Configuration - With External Plugins

Semantic Autocomplete, GoTo Declaration, Diagnostics, and Documentation on hover

Using the plugin EasyClangComplete, I have flawless semantic autocompletion of symbols (with snippets), triggered automatically when appropriate; Context aware GoTo Declaration (not definition); Real time diagnostics with inline decorations and popups; and documentation on hover, with a peek to the symbol's declaration.

This plugin is based on libclang's Python bindings, and takes as much information as it can from a single translation unit. That means this plugin does not handle multiple translation units together, and does not provide project-wide information.

Semantic GoTo Definition and Find All References

Using my own plugin, which is based on RTags, I implement anything that EasyClangComplete doesn't already give me. So far, I implemented GoTo Definition, Find All References, and Find Overrides of Virtual Method.

Include Paths Autocompletion

Using the plugin Include Autocomplete which is compile_commands.json aware.

Documentation Generation

Using the plugin DoxyDoxygen, I create and update code documentation with ease. It parses class and function signatures and generates a documentation snippet in real time. Furthermore, it is able to update the documentation when the signature changes.

Debugger - untested

Using SublimeGDB it is possible to debug your C++ program. I haven't tried it myself yet, but it seems to be a popular package that tries to make the best of the poor GUI API that is currently available.

Note

There are more plugins and features which are relevant for a C++ developer who uses Sublime Text, but these are not related to cquery's abilities in any way, so there is no point in mentioning them here.

How can you integrate cquery into Sublime Text

I can think of two ways in which cquery can be integrated into Sublime Text. I would love to hear your thoughts about each of them:

  1. Make it work with Sublime Text's LSP-client plugin (see #208).
  2. Create a dedicated cquery client, which enables the use of its extra, non lsp-standard supported features (either from scratch, or by forking/copying code of the existing plugins).

Information Resources

Here are useful links to the best sources for information about Sublime Text:

Is it worth the effort? Is it better than RTags?

As you can see, I got most of the stuff figured out already. But this project got me wondering whether I should stick with RTags, or switch to cquery.

cquery seems to bring everything in one package, like RTags. It looks very actively developed, and you respond to issues very quickly (as opposed to RTags' developers), which is very important to me as a user. Moreover, cquery claims to be faster than RTags, and it supports LSP.

So I would like you to answer these questions:

  1. Are there features that are currently provided by RTags, which are not available with cquery at the moment?
  2. Are there features that are currently provided by cquery, which are not available with RTags at the moment?
  3. How noticeable is the performance difference between RTags and cquery? Please refer to RAM usage, CPU usage, and time measurements.
  4. What are the recommended hardware specifications for a good use of cquery? What is needed to get the true performance boost (you mention high cache usage, do you mean RAM usage)?

Thank you very much for your time, I spent the entire day writing this to you :sunglasses: Have a lovely week!

MaskRay commented 6 years ago

😊

Regex based syntax highlighting Hashed Syntax Highlighting

cquery implements rainbow semantic highlighting https://github.com/cquery-project/cquery/blob/master/src/message_handler.cc EmitSemanticHighlighting

Regex based syntax highlighting is still good because files being edited may not compile, and compilation has high latency. There are some diff-algorithm-based heuristics for resolving cross references when the document differs from its indexed version.

GoTo Symbol Definition/Declaration/References - file wide and project wide

These LSP requests are most useful

There are also some fancy usages and some cquery cross reference extensions (e.g. $cquery/derived), see https://github.com/cquery-project/cquery/wiki/FAQ#definitions

Snippets

Snippets are supported via completion. The responses to textDocument/completion requests support ${2:foo} like snippets.
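For a client that cannot insert snippets directly, the ${2:foo} placeholders would have to be flattened before insertion. The following is a minimal sketch of that step; the function name is hypothetical, and nested placeholders and escaped `$` characters are deliberately not handled.

```python
import re

_PLACEHOLDER = re.compile(r"\$\{\d+:([^{}]*)\}")  # ${2:foo} -> foo
_TABSTOP = re.compile(r"\$\{\d+\}|\$\d+")         # ${1} or $0 -> (nothing)


def snippet_to_plain_text(snippet):
    """Replace each placeholder with its default text and drop bare tab stops."""
    text = _PLACEHOLDER.sub(r"\1", snippet)
    return _TABSTOP.sub("", text)


print(snippet_to_plain_text("foo(${1:int a}, ${2:char b})$0"))
```

Sublime's own snippet syntax also uses `${1:default}` fields, so a Sublime client may be able to pass the text through unchanged; that would need to be checked against the Sublime API documentation.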

Semantic Autocomplete, GoTo Declaration, Diagnostics, and Documentation on hover

textDocument/publishDiagnostics

textDocument/definition on a definition lists its declarations while textDocument/definition on declarations or references jumps to the definition.

textDocument/hover currently displays the type signature with fully qualified names, e.g. std::vector<int> hello::foo(char a, ...).
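Both textDocument/definition and textDocument/hover take a document URI plus a zero-based position, as defined by the LSP specification. A minimal sketch of building such a request follows; the helper name and the example URI are made up for illustration.

```python
import json


def make_position_request(request_id, method, uri, line, character):
    """Build a JSON-RPC request whose params are a document URI plus a position."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": {
            "textDocument": {"uri": uri},
            # LSP positions are zero-based for both line and character.
            "position": {"line": line, "character": character},
        },
    }


request = make_position_request(1, "textDocument/definition", "file:///tmp/a.cc", 4, 2)
print(json.dumps(request, indent=2))
```

The same helper works for textDocument/hover by changing only the method name.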

Include Paths Autocompletion

Supported. BTW, textDocument/references on a #include directive lists all files including that file. textDocument/definition jumps to the included file.

Find Overrides of Virtual Method.

A $cquery/derived request finds derived methods/classes while $cquery/base finds the base method/class. $cquery/memberHierarchy lists the type hierarchy but this is visualized in neither the VSCode plugin nor the Emacs plugin 😂 If someone is interested in cquery.el, I made an issue about generic symbol hierarchy

Documentation Generation

Comments are indexed by default. If the client supports it, textDocument/hover can display comments. There is also a documentation field in LSP's CompletionItem (CompletionItem.documentation).

Debugger

I'd like to see the Debug Protocol attract more attention.

How can you integrate cquery into Sublime Text

A dedicated plugin may still be needed for some cquery extensions. See the src/messages/cquery*.cc files. The organization of the Emacs LSP ecosystem may give some insights.

https://github.com/cquery-project/emacs-cquery/ is a plugin that leverages lsp-mode (which is a generic LSP client), because semantic highlighting and the cross reference extensions ($cquery/derived, $cquery/base, ...) are not in standard LSP.

Are there features that are currently provided by RTags, which are not available with cquery at the moment?

Macros

RTags supports indexing of function calls from one-level macro expansions (see https://github.com/cquery-project/cquery/issues/331). I haven't figured out a way to do it elegantly.

Dependency

  // a.h
  extern int foo;
  // a.cc
  #include "a.h"
  int foo;
  // b.cc
  extern int foo;

If a.h is changed out-of-band (outside of the editor) and the client then invokes $cquery/freshenIndex or workspace/didChangeWatchedFiles on a.h, the declarations of the variable foo do not update yet. But for a.cc, the declarations update. I don't know how well RTags handles this scenario.

RTags uses a C/S model while cquery is started by the editor and they communicate through stdin/stdout. Pipe (Emacs lsp-mode, LanguageClient-neovim), or Unix socket (VSCode). RTags does not support after-change-functions, through which it could update the document's cross reference information without a save. RTags supports re-reading compile_commands.json; cquery currently does not, and this should be improved.
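Whichever transport backs stdin/stdout, an LSP client frames each JSON-RPC message with a Content-Length header followed by a blank line, per the LSP base protocol. Below is a minimal sketch of that framing; the function names are hypothetical, and a real client would read from the server process's stdout rather than from an in-memory buffer.

```python
import io
import json


def encode_message(payload):
    """Serialize a JSON-RPC payload with the LSP Content-Length header."""
    body = json.dumps(payload).encode("utf-8")
    header = "Content-Length: {}\r\n\r\n".format(len(body)).encode("ascii")
    return header + body


def decode_message(stream):
    """Read one framed message from a buffered binary stream."""
    content_length = None
    while True:
        line = stream.readline()
        if not line:
            raise EOFError("stream closed before a full header was read")
        line = line.strip()
        if not line:
            break  # a blank line ends the header section
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            content_length = int(value.strip())
    if content_length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(stream.read(content_length).decode("utf-8"))


# Round trip through an in-memory buffer standing in for the server's pipe.
message = {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}
print(decode_message(io.BytesIO(encode_message(message))))
```

Note that the length counts bytes of the UTF-8 body, not characters, which matters as soon as a path or identifier contains non-ASCII text.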

Symbol information

rtags-symbol-info stores much information retrieved from libclang, while cquery stores less (see Index{Type,Var,Func} in https://github.com/cquery-project/cquery/blob/master/src/indexer.h )

Templates

How well does RTags handle template specialization? See https://github.com/cquery-project/cquery/issues/353

RTags traverses the translation unit top-down while cquery uses higher level clang_indexTranslationUnit through declaration/reference callbacks.

There are some limitations:

// U is not indexed because it is not referenced
template<class T, class U>
void f(T a) {}

There is no way to retrieve the location of : int32_t with the libclang API, but it is possible with the C++ API via EnumDecl::getIntegerTypeRange().

typedef int int32_t;
enum E : int32_t { E0, E1 };

RTags has rtags-print-enum-value-at-point but cquery saves the constant value in IndexVar::def::hover and presents it for textDocument/hover requests.

typedef int int32_t;
enum E : int32_t {
  E0,
  E1,
};

E1's incarnation in the cache file (IndexVar in IndexFile)

   {
      "id": 2,
      "usr": 8563738222191832000,
      "short_name": "E1",
      "detailed_name": "E::E1",
      "kind": 15,
      "storage": 0,
      "hover": "E::E1 = 1",
      "declarations": [],
      "definition_spelling": "5:3-5:5",
      "definition_extent": "5:3-5:5",
      "variable_type": 1,
      "declaring_type": 1,
      "uses": []
    }
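A client-side tool inspecting a JSON cache entry like the one above could read the hover text and the location ranges as follows. This is only a sketch: the interpretation of "5:3-5:5" as line:column pairs is inferred from the example, and the helper name is hypothetical.

```python
import json


def parse_range(text):
    """Parse 'line:col-line:col' into ((line, col), (line, col))."""
    start, end = text.split("-")
    return tuple(tuple(int(part) for part in pos.split(":")) for pos in (start, end))


# A trimmed copy of the record shown above.
record = json.loads("""
{
  "short_name": "E1",
  "detailed_name": "E::E1",
  "hover": "E::E1 = 1",
  "definition_spelling": "5:3-5:5"
}
""")

print(record["hover"])
print(parse_range(record["definition_spelling"]))
```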

How noticeable is the performance difference between RTags and cquery? Please refer to RAM usage, CPU usage, and time measurements.

I think RTags does not do well in terms of query performance. https://github.com/Andersbakken/rtags/issues/1007

# cquery cache files + copies of source files
% du -sh .vscode/cquery_cached_index
91M     .vscode/cquery_cached_index
# The bin/cquery process
# VIRT 1255MB
# RES 294MB

# rtags cache files
% du -sh ~/.cache/rtags/_home_ray_Dev_Bin_radare2_
567M    /home/ray/.cache/rtags/_home_ray_Dev_Bin_radare2_
# The /usr/bin/rdm process
# VIRT 345MB
# RES 67MB

The on-disk storage of cquery can also be improved.

llvm+clang+libcxx+libcxxabi takes 53 minutes to index on my laptop (X1 Carbon)

% du -sh .vscode/cquery_cached_index
630M    .vscode/cquery_cached_index
% stat -c %s .vscode/cquery_cached_index/**/*.mpack | awk '{s+=$1}END{print s}'    
394806885
VIRT 4631MB
RES 1565MB 

Restart cquery to load the cache files:

38.610s  ..... Applying index update
VIRT 3377MB
RES 1416MB

But you can send queries while cquery is loading the cache. https://github.com/cquery-project/cquery/wiki/Design

jacobdufault commented 6 years ago

RTags uses a C/S model while cquery is started by the editor and they communicate through stdin/stdout. Pipe (Emacs lsp-mode, LanguageClient-neovim), or Unix socket (VSCode).

fwiw vscode also uses a pipe, cquery does not currently support socket-based LSP.

It'd be really great if cquery was integrated into Sublime Text. How you choose to do so is up to you, but I'd recommend trying to leverage any existing LSP support. cquery has only a couple of LSP extensions, and they are designed so that it should be relatively simple to extend another LSP client to support the additional methods.

Thanks for the interest! I hope we get great sublime text integration as a result :)

MaskRay commented 6 years ago

fwiw lsof +E -aUc querydb -d 0 => "TYPE => Unix" : the "querydb" thread (main thread of the cquery process) communicates with VSCode via a Unix domain socket. Its stdin/stdout can be redirected to a pipe or a Unix domain socket, but this is transparent to the program. cquery uses std::cout and fread, but the underlying IO mechanism can differ.

MaskRay commented 6 years ago

@papadokolos It would be really nice if you could create a client for Sublime Text.

I'm so excited to highlight the two new features in the https://github.com/cquery-project/cquery/releases/tag/v20180213 release

  1. textDocument/definition in comments will search for the qualified identifier at point approximately in all symbols.
  2. CXSymbolRole differentiates read/write references and Emacs lsp-mode assigns different colors for them. There are no other libclang users supporting this as it just landed in libclang yesterday!

Jo2003 commented 6 years ago

Yes, a Sublime client would be really nice. I tried it with the LSP plugin, but had no luck.

papadokolos commented 6 years ago

It requires around six restarts of Sublime Text to make it work with LSP, but I can assure you that it does work, partially.

I managed to make it run the cquery server, and to communicate with it for diagnostics, auto-complete and goto-definition. But, unfortunately, it is unstable and sometimes behaves unexpectedly.

Jo2003 commented 6 years ago

Could you share your LSP configuration, please? Thanks!

Jo2003 commented 6 years ago

OK, I have it working and it is stable (Ubuntu 16.04). These are my user settings for LSP:

{
    "clients":
    {
        "cquery":
        {
            "languageId": "cpp",

            "command":
            [
                "/opt/cquery/build/release/bin/cquery",
                "--log-file=/var/log/cquery/cquery.log"
            ],

            "scopes":
            [
                "source.c",
                "source.c++"
            ],

            "syntaxes":
            [
                "Packages/C++/C.sublime-syntax",
                "Packages/C++/C++.sublime-syntax"
            ],

            "initializationOptions":
            {
                "cacheDirectory": "/home/me/.cquery"
            }
        }
    }
}

Take care with cacheDirectory. Furthermore, in case you want a log file, make sure the folder for the log file is writable by you.

papadokolos commented 6 years ago

I'm glad you worked it out! But are you sure that it is working smoothly?

I, for example, have an issue with the diagnostics, which seem to always lag behind the current state of the file. For example, if a whole word causes a compilation error, only the first letter is marked. This gets fixed when I save the file.

jacobdufault commented 6 years ago

@papadokolos There have been a lot of changes to cquery; it is likely that it works better now :)

Jo2003 commented 6 years ago

I think there is some stuff we can optimize in the settings. I also saw that my code is checked while I'm writing. This is something I don't like. I should be informed about problems on save. Having a look in the log file, I see many settings which can be changed through initializationOptions. I still have to invest some time into this.

jacobdufault commented 6 years ago

I also saw that my code is checked while I'm writing.

Set diagnostics.frequencyMs to -1.

https://github.com/cquery-project/cquery/blob/master/src/config.h#L153
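In terms of the LSP client settings shown earlier in this thread, that option would sit under "diagnostics" inside initializationOptions. Here is a small sketch of applying it to an existing options dictionary; the helper name is hypothetical, and the base options are copied from the configuration posted above.

```python
import json


def disable_completion_diagnostics(options):
    """Return a copy of initializationOptions with diagnostics.frequencyMs set to -1."""
    updated = dict(options)
    # Copy the nested dict too, so the caller's original options stay untouched.
    diagnostics = dict(updated.get("diagnostics", {}))
    diagnostics["frequencyMs"] = -1
    updated["diagnostics"] = diagnostics
    return updated


base_options = {"cacheDirectory": "/home/me/.cquery"}
print(json.dumps(disable_completion_diagnostics(base_options), indent=4))
```

The printed object can be pasted as the "initializationOptions" value of the "cquery" client entry.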

Jo2003 commented 6 years ago

Here are my complete LSP settings for sublime. Maybe someone will find it helpful ...

// Settings in here override those in "LSP/LSP.sublime-settings",
{
    "clients":
    {
        "cquery":
        {
            "languageId": "cpp",

            "command":
            [
                "/opt/cquery/build/release/bin/cquery",
                "--log-file=/var/log/cquery/cquery.log"
            ],

            "scopes":
            [
                "source.c",
                "source.c++"
            ],

            "syntaxes":
            [
                "Packages/C++/C.sublime-syntax",
                "Packages/C++/C++.sublime-syntax"
            ],

            "initializationOptions":
            {
                // Allow indexing on textDocument/didChange.
                // May be too slow for big projects, so it is off by default.
                // "enableIndexOnDidChange": false,

                // Root directory of the project. **Not available for configuration**
                // "projectRoot": "",

                // If specified, this option overrides compile_commands.json and this
                // external command will be executed with an option |projectRoot|.
                // The initialization options will be provided as stdin.
                // The stdout of the command should be the JSON compilation database.
                // "compilationDatabaseCommand": "",

                // Directory containing compile_commands.json.
                // "compilationDatabaseDirectory": "",

                // Cache directory for indexed files.
                "cacheDirectory": "/home/me/.cquery",

                // Cache serialization format.
                //
                // "json" generates `cacheDirectory/.../xxx.json` files which can be pretty
                // printed with jq.
                //
                // "msgpack" uses a compact binary serialization format (the underlying wire
                // format is [MessagePack](https://msgpack.org/index.html)) which typically
                // takes only 60% of the corresponding JSON size, but is difficult to inspect.
                // msgpack does not store map keys and you need to re-index whenever a struct
                // member has changed.
                // "cacheFormat": "json",

                // Value to use for clang -resource-dir if not present in
                // compile_commands.json.
                //
                // cquery includes a resource directory, this should not need to be configured
                // unless you're using an esoteric configuration. Consider reporting a bug and
                // fixing upstream instead of configuring this.
                //
                // Example value: "/path/to/lib/clang/5.0.1/"
                // "resourceDirectory": "",

                // Additional arguments to pass to clang.
                "extraClangArguments":
                [
                    "-Wno-inconsistent-missing-override",
                    "-Wno-format",
                    "-Wno-extern-c-compat"
                ],

                // If true, cquery will send progress reports while indexing
                // How often should cquery send progress report messages?
                //  -1: never
                //   0: as often as possible
                //   xxx: at most every xxx milliseconds
                //
                // Empty progress reports (ie, idle) are delivered as often as they are
                // available and may exceed this value.
                //
                // This does not guarantee a progress report will be delivered every
                // interval; it could take significantly longer if cquery is completely idle.
                // "progressReportFrequencyMs": 500,

                // If true, document links are reported for #include directives.
                // "showDocumentLinksOnIncludes": true,

                // Version of the client. If undefined the version check is skipped. Used to
                // inform users their vscode client is too old and needs to be updated.
                // "clientVersion": 1,

                "client":
                {
                    // TextDocumentClientCapabilities.completion.completionItem.snippetSupport
                    // "snippetSupport": false
                },

                "codeLens":
                {
                    // Enables code lens on parameter and function variables.
                    // "localVariables": true
                },

                "completion":
                {
                    // Some completion UI, such as Emacs' completion-at-point and company-lsp,
                    // display completion item label and detail side by side.
                    // This does not look right, when you see things like:
                    //     "foo" "int foo()"
                    //     "bar" "void bar(int i = 0)"
                    // When this option is enabled, the completion item label is very detailed,
                    // it shows the full signature of the candidate.
                    // The detail just contains the completion item parent context.
                    // Also, in this mode, functions with default arguments,
                    // generates one more item per default argument
                    // so that the right function call can be selected.
                    // That is, you get something like:
                    //     "int foo()" "Foo"
                    //     "void bar()" "Foo"
                    //     "void bar(int i = 0)" "Foo"
                    // Be wary, this is quickly quite verbose,
                    // items can end up truncated by the UIs.
                    "detailedLabel": true,

                    // On large projects, completion can take a long time. By default if cquery
                    // receives multiple completion requests while completion is still running
                    // it will only service the newest request. If this is set to false then all
                    // completion requests will be serviced.
                    // "dropOldRequests": true,

                    // If true, filter and sort completion response. cquery filters and sorts
                    // completions to try to be nicer to clients that can't handle big numbers
                    // of completion candidates. This behaviour can be disabled by specifying
                    // false for the option. This option is the most useful for LSP clients
                    // that implement their own filtering and sorting logic.
                    "filterAndSort": false,

                    // Regex patterns to match include completion candidates against. They
                    // receive the absolute file path.
                    //
                    // For example, to hide all files in a /CACHE/ folder, use ".*/CACHE/.*"
                    "includeBlacklist":
                    [
                        ".*/deprecated/.*"
                    ],

                    // Maximum path length to show in completion results. Paths longer than this
                    // will be elided with ".." put at the front. Set to 0 or a negative number
                    // to disable eliding.
                    // "includeMaxPathSize": 30,

                    // Whitelist that file paths will be tested against. If a file path does not
                    // end in one of these values, it will not be considered for
                    // auto-completion. An example value is { ".h", ".hpp" }
                    // default are .h, .hh, .hpp. 
                    // values given here will be appended to the defaults.
                    // This is significantly faster than using a regex.
                    "includeSuffixWhitelist":
                    [
                        ".hxx"
                    ]
                },

                "diagnostics":
                {
                    // Like index.{whitelist,blacklist}, don't publish diagnostics to
                    // blacklisted files.
                    "blacklist":
                    [
                        ".*/deprecated/.*"
                    ],

                    // How often should cquery publish diagnostics in completion?
                    //  -1: never
                    //   0: as often as possible
                    //   xxx: at most every xxx milliseconds
                    // "frequencyMs": 0,

                    // If true, diagnostics from a full document parse will be reported.
                    // "onParse": true,

                    // "whitelist": []
                },

                "highlight":
                {
                    // Like index.{whitelist,blacklist}, don't publish semantic highlighting to
                    // blacklisted files.
                    // "blacklist": [],

                    // "whitelist": [],
                },

                "index":
                {
                    // Attempt to convert calls of make* functions to constructors based on
                    // heuristics.
                    //
                    // For example, this will show constructor calls for std::make_unique
                    // invocations. Specifically, cquery will try to attribute a ctor call
                    // whenever the function name starts with make (ignoring case).
                    // "attributeMakeCallsToCtor": true,

                    // If a translation unit's absolute path matches any ECMAScript regex in the
                    // whitelist, or does not match any regex in the blacklist, it will be
                    // indexed. To only index files in the whitelist, add ".*" to the blacklist.
                    // `std::regex_search(path, regex, std::regex_constants::match_any)`
                    //
                    // Example: `ash/.*\.cc`
                    // "blacklist": [],

                    // 0: none, 1: Doxygen, 2: all comments
                    // Plugin support for clients:
                    // - https://github.com/emacs-lsp/lsp-ui
                    // - https://github.com/autozimu/LanguageClient-neovim/issues/224
                    // "comments": 2,

                    // If false, the indexer will be disabled.
                    // "enabled": true,

                    // If true, project paths that were skipped by the whitelist/blacklist will
                    // be logged.
                    // "logSkippedPaths": false,

                    // Number of indexer threads. If 0, 80% of cores are used.
                    // "threads": 0,

                    // "whitelist": []
                },

                "workspaceSymbol":
                {
                    // Maximum workspace search results.
                    // "maxNum": 1000,

                    // If true, workspace search results will be dynamically rescored/reordered
                    // as the search progresses. Some clients do their own ordering and assume
                    // that the results stay sorted in the same order as the search progresses.
                    // "sort": true
                },

                "xref":
                {
                    // If true, |Location[]| response will include lexical container.
                    // "container": false,

                    // Maximum number of definition/reference/... results.
                    // "maxNum": 2000
                },

                // For debugging
                // Dump AST after parsing if some pattern matches the source filename.
                // "dumpAST": [],
            }
        }
    }
}
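The whitelist/blacklist rule described in the "index" comments above can be sketched as follows. This mirrors only the wording of the comment and is not cquery's implementation; the function name is hypothetical, and Python's `re` syntax differs from ECMAScript regex in some details.

```python
import re


def should_index(path, whitelist, blacklist):
    """Index a path if it matches any whitelist regex or no blacklist regex."""
    if any(re.search(pattern, path) for pattern in whitelist):
        return True
    return not any(re.search(pattern, path) for pattern in blacklist)


# Index only files matching the whitelist by blacklisting everything else.
whitelist = [r"ash/.*\.cc"]
blacklist = [r".*"]
print(should_index("/src/ash/shell.cc", whitelist, blacklist))
print(should_index("/src/base/logging.cc", whitelist, blacklist))
```

With both lists empty, every path is indexed, which matches the commented-out defaults in the settings above.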