Proposal: Language Server

ObliviousHarmony commented 1 week ago

Is your feature request related to a problem?

After migrating from PHPStorm to VS Code I noticed that the available PHP_CodeSniffer extensions lacked meaningful integration with editor features.

phpcs: Despite being the most popular option, this extension only supports in-line warnings/errors. The extension provides no auto-fix or formatting options at all. It also runs phpcs synchronously for each file.
- Although it continues to "work" depending on your use-case, this extension is currently unmaintained and abandoned.
- There is a fork that has a large number of downloads but it is also abandoned.
PHP Sniffer & Beautifier: This extension provides support for in-line warnings/errors as well as document formatting. This extension does not support in-line fixes and selection document formatting.
PHP Sniffer: Another extension that provides in-line warnings/errors and full/selection document formatting. It does not support in-line fixes.
- It supports selection document formatting by only running against the selected snippets and adding in a <?php tag. While in theory this is great, in practice, it means any warnings/errors registered by a token on an unselected line won't be included in the formatting.

Despite some getting most of the way there, none of these extensions fully utilize VS Code's language tools, nor do they provide feature parity with PHPStorm:

Line/File Ignore: Support for automatically adding phpcs:ignore comments on lines or files using a right-click context menu.
Individual Fixes: Fix selected problem(s) using a right-click context menu.
Performance: Even with a debounce, running phpcs against a large file with a lot of expensive sniffs can take a while.

To be clear, these are limitations imposed by phpcs. These extensions take advantage of the json report type and there is no other provided report that bridges the gap.

Describe the solution you'd like

As developers do, when I encountered these issues, my first instinct was (naturally) to create my own extension (obligatory xkcd). (Technically my first instinct was to contribute upstream but the extensions were all abandoned or tiny at the time.) This extension achieves parity with PHPStorm using a custom report integration that returns warnings/errors and supports both individual fixes and document/selection formatting. This approach allows for a great deal of control over the behavior of phpcs, however, it requires extending internal classes. While the classes I've extended seem relatively safe, it's not ideal for compatibility.

Given how ubiquitous VS Code has become (along with its myriad of popular forks), it seems reasonable for us to invest in providing better support for these kinds of tools. I've got a few possible approaches that I'd like to discuss:

Implement Custom Reports

The meat of my custom integration is an extended Fixer class and File class. It works by ignoring warnings/errors that aren't specifically relevant to the requested report. For instance, individual fixes work by ignoring all warnings/errors other than the given sniff source and token.

The least contentious option would be to add similar functionality directly within PHP_CodeSniffer. This is agnostic to any specific editor or extension and opens the door for others to provide the same functionality anywhere they require it. The main negative I see with this approach is that it would only provide support with the latest version of phpcs. This means any extension would still have to support the old report style moving forward.

Another avenue here could be to add this functionality in a separate Composer package. My extension does something similar to support use-cases where phpcs is run on a guest OS via Docker or similar. This could be bundled with anything that wanted to use this extended functionality.

Language Server

One caveat to distributing a package (or bundling reports) is that it still requires custom extensions to be developed for each editor that wants to use the functionality. Microsoft's Language Server Protocol was created to provide developers a consistent interface for providing language features to any editor. We could provide a language server and then any extension would be free to use it.

Integrated Server

The most robust option is to build a language server into phpcs directly. My assumption is that deeply integrating like this would allow us to take advantage of the benefits of having a long-lived process. For example, could we use an in-memory cache of open files with their tokens and associated messages to save time? I'm not totally sure and if this is something we're interested in then we should profile phpcs and see what kind of savings we would get if we eliminated stuff we can cache.

The caveat is that this feels like a pretty big undertaking. There's a package with Language Server Protocol DTOs but we would still need to add our own RPC reading/writing. This also has the same problem of bundling in that it's only available for the latest version of phpcs.

Separate Server

Microsoft provides a framework for building language servers that takes care of the server/client communication and server lifecycle. All we would need to do is provide an interface between phpcs and the language server. This is essentially what my extension does already, however, it uses VS Code's API directly rather than providing an LSP interface.

The problem with a separate server is that compatibility becomes an issue again. In order to avoid bundling code in the latest phpcs release we would need to provide a custom report type for the server. As the "official" language server we would need to make maintaining compatibility a priority. I haven't thought deeply about this in a while but one option could be a hybrid approach. We could start shipping a language server report in phpcs and provide any necessary polyfills for older versions in the language server. This gives us forward compatibility (since the report and all related modifications are bundled) as well as backward compatibility (a little hacky perhaps but these versions aren't going to ever change).

I lean really heavily towards building a separate language server. It provides the most comprehensive compatibility and if we start bundling the language server report generation we can also get the benefits of deeper integration. As for the server itself, we can take advantage of asynchronous workers and caching to make it as performant as possible. In my experience this has worked out really well.

To be clear, I am totally on board with building the language server. I've had a lot of fun working on my VS Code extension and Microsoft's language server framework is pretty much 1:1 with VS Code's API. There's a few differences and I'd use the opportunity to re-write everything but I don't see this being a significant time investment.

jrfnl commented 1 week ago

@ObliviousHarmony Thanks for opening this issue and I applaud your efforts in creating a VSCode integration.

In my opinion, facilitating the ability for IDEs to create integrations is inside the scope of the PHPCS project (but only just and only when the changes are reasonable). Anything else, like actually providing integrations with IDEs is out of scope.

PHPCS is first and foremost a stand-alone CLI tool. The fact that popular IDEs want to provide support for the tooling is great and welcome, however for any such integration the functionality provided by PHPCS should be leading, not the other way around.

As a side-note: the text in your issue is confusing to me: who are the "we" you keep talking about ?

Implement Custom Reports

It is quite unclear to me what you are proposing here. A custom report, but what is the custom report supposed to do ?

As a rule of thumb, new reports will rarely be accepted. Custom reports can already be provided via an external standard since PHPCS 3.3.0: squizlabs/PHP_CodeSniffer#1948 (and it seems you figured out how to get that working already).

Language Server

Again, it is unclear to me what you are proposing. By the looks of it, there are already several PHP language servers available.

Also, refactoring PHPCS to support a single IDE is completely out of the question.

take advantage of the benefits of having a long-lived process

I must be missing something, but as I said before PHPCS is first and foremost a stand-alone CLI tool, so I honestly don't see how this would apply to PHPCS.

If anything, I urge you to contact the PHPStorm team and teams working on integrations with other IDEs and to come up with something which will be usefull and usable by all. I can even imagine you all creating a (report) protocol which could be used by most, if not all, PHP static analysis tools, including PHPStan, CS-Fixer, Phan etc Preferably, such a project would then also be jointly maintained by you all.

I get the impression that in for both proposals, it's about getting a report for only a small portion of the code in a file. Is that correct ?

In that case, these proposals are a dead-end. Files should always be scanned as a whole as sniffs will often look at the larger context of a file. Some errors may be reported on line 1 if they apply to the file as a whole, which is problematic when reports don't take the whole file into account.

The main negative I see with this approach is that it would only provide support with the latest version of phpcs. This means any extension would still have to support the old report style moving forward. ... This also has the same problem of bundling in that it's only available for the latest version of phpcs.

That will always be the case for changes in PHPCS itself.

All in all, nothing I've read here reads like something which is in-scope for the PHPCS project.

ObliviousHarmony commented 1 week ago

Thanks for the feedback @jrfnl!

As a side-note: the text in your issue is confusing to me: who are the "we" you keep talking about ?

Sorry, "we" is "me!" I used to work for an executive that was obsessed with everyone always using the royal "we". I guess it just ended up sticking with me. You're the first person to ever question it :smile:

It is quite unclear to me what you are proposing here. A custom report, but what is the custom report supposed to do ?

My calling it a "custom report" was overly simplistic. It's an integration that extends phpcs with a custom report that reads input from an environment variable and processes requests from the extension. The primary purpose of the integration is to provide a more granular level of control over the fixes that are applied. As a quick example, this request fixes a single problem at a specific line:column position. It then returns the sections of the file that have changed with the new content so that the extension can apply it.

This is made possible by extending PHP_CodeSniffer\Files\File with my own. One of the neat things I noticed was that addMessage() could be extended and I could reject messages that weren't relevant to the current request. This opened the door to applying single sniffs (performance optimized) and fixing only sections of the file.

By the looks of it, there are already several PHP language servers available.

Sorry, I should have been more clear. A "language server" in this context is a service that provides language features, such as linting and formatting in our case. It communicates with clients using the Language Server Protocol and as such is editor-agnostic. LSP is supported by most editors including Visual Studio Code, Sublime, JetBrains IDEs, vim, and more. You just willed it into existence :smile:

What I am proposing is the creation of a PHPCS Language Server NPM package. Using Microsoft's Node.js LSP server, I can create a service that interfaces with PHPCS via the CLI. It passes the documents via stdin with appropriate arguments and then supplies clients with the diagnostics, inline fixes, and formatting.

Also, refactoring PHPCS to support a single IDE is completely out of the question.

That would be absolute insanity!

I get the impression that in for both proposals, it's about getting a report for only a small portion of the code in a file. Is that correct ?

Taking a step back, the primary purpose of the integration was to provide support for a --range option that does not exist. This option would accept a {startLine}-{endLine} input with an optional column using {line:column}. This gets you the ranged formatting; combining it with --sniffs will get the single-sniff fixes. This seems like it aligns pretty well with PHPCS being a CLI tool first and foremost. As an added bonus, if it supports multiple ranges, it can be used to target specific sections of code.

Files should always be scanned as a whole as sniffs will often look at the larger context of a file. Some errors may be reported on line 1 if they apply to the file as a whole, which is problematic when reports don't take the whole file into account.

While true, if the caller says they only care about problems in a specific range, is it really a issue?

One thing I want to note is that the approach taken in my integration still processes the entire file. By returning prematurely in File::addMessage(), we avoid registering the message and no problems are reported. When fixing, sniffs rely on File::addFixableWarning() and File::addFixableError() returning true when the fix should be applied. Since File::addMessage() returns false for messages out of the --range, the fix is not applied. Thanks to persistence of tokens too, the range won't be affected by changes made by fixes in a single fix cycle. There's still an edge case with multiple cycles, however, limiting ranges to a single fix cycle per call might be reasonable.

PHPCSStandards / PHP_CodeSniffer