hylo-lang / hylo

The Hylo programming language
https://www.hylo-lang.org
Apache License 2.0

Hylo LSP proof-of-concept #1010

Open koliyo opened 11 months ago

koliyo commented 11 months ago

Hello Hylo team!

I am really excited about the Hylo language effort. Watching the published presentations and listening to the podcasts has convinced me that Hylo has very high potential, and that if the project and community get enough momentum it could have a large impact on writing safe and performant code.

I wanted to get more familiar with the language itself, as well as design and implementation details of the compiler.

So I thought building an LSP server for Hylo was the perfect project to get hands-on experience.

I do have a working proof-of-concept implementation of a hylo-lsp server at this point. I have not been able to spend as much time on the actual Hylo compiler API as I originally intended, because I soon realized that the Swift LSP ecosystem was missing some key components to bootstrap such an LSP project.

I found the LanguageServerProtocol project, which provides some of the building blocks but has until now only been used for client-side LSP integration.

So I have put quite a bit of work into adding functionality for LSP server development, tracked in this issue: https://github.com/ChimeHQ/LanguageServerProtocol/issues/7

After working on those parts, I could return to the Hylo-specific parts of hylo-lsp, which is written in Swift and integrates with the Hylo compiler API.

The hylo-lsp repo is here: https://github.com/koliyo/hylo-lsp

And the current implementation is in the feature/wip branch.

The project consists of the following:

I also have a fork of hylo with some minor changes to allow the LSP development.

The current POC LSP implementation has the following functionality:

For the semantic token functionality I am missing AST nodes to build more complete support, specifically AST nodes for keywords, e.g. public, fun, etc. I understand these do not need to be stored for normal compilation, but for complete syntax highlighting coverage they would be useful to have. Maybe they could be enabled with some custom parameter as part of AST construction?

Also, I have forked the Hylo compiler itself with some minor changes, mainly to build the compiler as a library and to add a public constructor to SourcePosition.

What are your thoughts on LSP development? Have you started some internal effort towards this, or how do you see this work going forward? I would be happy to contribute and continue helping to develop parts of the LSP server.

The implementation is currently in a pretty rough WIP state :)

I really hope you continue your effort with Hylo development, and that wider adoption follows. It is great that you have a public roadmap on the website as well.

Let me know your thoughts on this!

kyouko-taiga commented 11 months ago

First of all, wow! Thanks a lot for this work.

I think I can confidently speak on behalf of all contributors to say that we're very excited to see this work going forward. I would be happy to provide all the help I can.

For the semantic token functionality I am missing AST nodes to build more complete support.

Almost all declarations have an introducer or introducerSite property that provides the source locations of their introducer keyword. For example, FunctionDecl.introducerSite is the source range of the fun keyword in all function declarations.

Declaration modifiers (e.g., public) and other keywords are represented with a type called SourceRepresentable<T> that's notionally a value of type T (typically the abstract value of the keyword) along with its source locations.

Would you be able to work with these objects alone or do you need proper AST nodes?
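
As a rough illustration of the two shapes described above, here is a language-neutral sketch in Python (the compiler itself is written in Swift). Every name in it (`SourceRange`, `SourceRepresentable`, `AccessModifier`, `FunctionDecl`, `keyword_ranges`) is a hypothetical stand-in modeled on the description in this comment, not the actual Hylo compiler API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Generic, List, Optional, Tuple, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class SourceRange:
    """Half-open [start, end) range of character offsets in one source file."""
    start: int
    end: int


@dataclass(frozen=True)
class SourceRepresentable(Generic[T]):
    """A value of type T paired with the range of source text that spells it."""
    value: T
    site: SourceRange


class AccessModifier(Enum):
    PUBLIC = "public"


@dataclass(frozen=True)
class FunctionDecl:
    """Hypothetical declaration node; only keyword-related fields are modeled."""
    introducer_site: SourceRange  # range of the `fun` keyword
    access_modifier: Optional[SourceRepresentable[AccessModifier]] = None


def keyword_ranges(decl: FunctionDecl) -> List[Tuple[str, SourceRange]]:
    """Collect (keyword text, range) pairs for highlighting, in source order."""
    found = [("fun", decl.introducer_site)]
    if decl.access_modifier is not None:
        modifier = decl.access_modifier
        found.append((modifier.value.value, modifier.site))
    return sorted(found, key=lambda pair: pair[1].start)


# Models the source text `public fun main() {}`.
decl = FunctionDecl(
    introducer_site=SourceRange(7, 10),
    access_modifier=SourceRepresentable(AccessModifier.PUBLIC, SourceRange(0, 6)),
)
keywords = keyword_ranges(decl)
```

The point of the sketch is that a highlighter only needs the ranges, so it does not matter whether a keyword is a full AST node or a value-plus-range pair hanging off a declaration.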

koliyo commented 11 months ago

Ok, nice, I will look into using the introducer, introducerSite, and other references. I do not really need it to be explicitly part of the AST, just as long as it is available somewhere. I just needed some pointers on where to look I guess 🤗

And to be honest, I don't plan on spending a ton of effort on this. But it was a good project to get in depth knowledge of Hylo, and hopefully a nudge in the right direction to get broader adoption of the language.

What I really want is to actually start using Hylo itself. I realize we are still in a very early phase, but as soon as the compiler, language design, and stdlib are mature enough I want to be able to start building some application or library using Hylo.

And IDE integration specifically is to me a really important part of making a language productive in the hands of a developer. As a reference, I do a lot of work in a modern, cross-platform .NET environment with C#, and the Roslyn-based LSP is such a night-and-day differentiator compared to not having that level of IDE support.

Additionally, not just human developers but also AI developers will make great use of LSP tools for code comprehension and navigation. And Hylo could imo be an extremely powerful platform in this context for building next-generation applications.

My thoughts on next steps for this LSP initiative:

  1. Get my forks of the dependencies, mainly the LanguageServerProtocol library, merged back to the upstream repositories.
  2. Clean up the hylo-lsp WIP a bit, and make sure others can build and test it locally.
  3. End user (developer) artifacts. As a first step, make sure there is a build pipeline with a release artifact, e.g. a .vsix extension archive that can be installed in VS Code. At a later point this could also be published to the extension marketplace.
  4. Hylo organization based governance for the LSP, e.g. migrate to a hylo-lang/hylo-lsp repository(?)

koliyo commented 11 months ago

Regarding keywords, I have resolved a lot of tokens since last time. But I have not been able to get introducer for ProductTypeDecl, is this not available? For VarDecl I was able to use the outer BindingDecl.

Additionally, are ranges for source code comments available?

kyouko-taiga commented 11 months ago

But I have not been able to get introducer for ProductTypeDecl, is this not available?

It seems like we don't have one. Please open an issue. I'll implement it when I find some time.

For VarDecl I was able to use the outer BindingDecl.

That's probably the best way to get an introducer. Note that a BindingDecl may introduce multiple variables; all will have the same introducer.

Additionally, are ranges for source code comments available?

No, comments are simply ignored during tokenization. It may be a little cumbersome to add them to the AST.

Perhaps in the long run we may need a different parser to interact with LSP, to produce a concrete syntax tree rather than an abstract one. IIUC that's what Swift does (see swift-syntax).

koliyo commented 11 months ago

I have worked a bit on setting up a release workflow for the VS Code extension. I have a first test version available now, if anyone is interested in trying it out.

NOTE: The release build only supports Apple Silicon Macs (M1 & M2) at the moment.

https://github.com/koliyo/hylo-lsp/releases/tag/v0.5.0

Download the vsix file and install from command line:

code --install-extension ~/Downloads/hylo-lang-0.5.0.vsix

It is also possible to build locally with the script build-and-install-vscode-extension.sh in the hylo-lsp repository.

Current functionality

Semantic token support, for quite a lot of different node types at this point

Document symbol list/outline

Jump to symbol/definition

Error diagnostics for compilation errors

Of course, lots of things are also not working well yet; it is a very early version 😅
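
For readers unfamiliar with how semantic tokens travel over LSP: the protocol sends them as a flat integer array, five integers per token (line delta, start-character delta, length, token type index, modifier bitset), each position relative to the previous token. The sketch below shows that encoding in Python; the `Token` type, the legend contents, and the sample tokens are illustrative assumptions and are not taken from hylo-lsp.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    line: int        # zero-based line number
    start: int       # zero-based start character on that line
    length: int      # token length in characters
    token_type: str  # must appear in TOKEN_TYPES


# Legend a server would advertise; only indices into it are sent on the wire.
TOKEN_TYPES = ["keyword", "function", "variable", "type"]


def encode_semantic_tokens(tokens: List[Token]) -> List[int]:
    """Encode tokens as the relative 5-integer groups used by LSP."""
    data: List[int] = []
    prev_line, prev_start = 0, 0
    for tok in sorted(tokens, key=lambda t: (t.line, t.start)):
        delta_line = tok.line - prev_line
        # The start is relative to the previous token only on the same line.
        delta_start = tok.start - prev_start if delta_line == 0 else tok.start
        type_index = TOKEN_TYPES.index(tok.token_type)
        data += [delta_line, delta_start, tok.length, type_index, 0]  # 0 = no modifiers
        prev_line, prev_start = tok.line, tok.start
    return data


# Line 0: `public fun main() {`, line 1: `  let x = 1`.
tokens = [
    Token(0, 0, 6, "keyword"),    # public
    Token(0, 7, 3, "keyword"),    # fun
    Token(0, 11, 4, "function"),  # main
    Token(1, 2, 3, "keyword"),    # let
    Token(1, 6, 1, "variable"),   # x
]
encoded = encode_semantic_tokens(tokens)
```

Because every position depends on the token before it, tokens must be emitted in source order, which is why the sketch sorts them first.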

koliyo commented 11 months ago

I pushed a new version; there were some pretty severe document sync issues in the first release. I am getting a lot fewer errors now. I also moved the release to the hylo-vscode-extension repository:

https://github.com/koliyo/hylo-vscode-extension/releases/tag/v0.5.3

kyouko-taiga commented 11 months ago

Thanks a lot for this progress report. It looks absolutely awesome!

I'm sorry I didn't have time to test the extension earlier. It seems that my editor is failing to connect to the LSP server, as all requests report a failure. Any idea of a possible step I may have missed?

koliyo commented 10 months ago

Please try again with the updated release; I have done more development and pushed some new versions. Are you running on Mac?

koliyo commented 10 months ago

Here is a summary with feedback on LSP development for Hylo that probably needs to be considered going forward, at least in a longer time perspective. This includes:

koliyo commented 9 months ago

@kyouko-taiga Have you had a chance to test running the extension again? I would be glad to help get you up and running!

I have just rebuilt the LSP with the recent changes to Hylo, so it is up to date. The LSP version is now v0.6.10.

Also, I have managed to get my upstream PR for Swift LSP server development merged into the LanguageServerProtocol repository, so it is now much easier to get started with server-side LSP development in Swift. See https://github.com/ChimeHQ/LanguageServerProtocol/pull/14

dabrahams commented 9 months ago

I'd very much like to try using the extension with emacs, since that's my primary development environment.

kyouko-taiga commented 9 months ago

Have you had a chance to test running the extension again? I would be glad to help get you up and running!

Haven't had time to try again yet, sorry for the lack of updates. I will try during the weekend and report here.

koliyo commented 8 months ago

I have zero experience with emacs, and I do not think I will be able to allocate time to setting up the emacs integration unfortunately.

The upside of the LSP architecture is that the majority of code and logic is in the portable LSP server, and the IDE integration is a pretty thin interface. The main additional complexity in the VSCode extension is that it does handle dynamically installing and updating the LSP server. This allows:

Hopefully we can find someone else in the Hylo community who could help with the Emacs integration.
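
To give a sense of how thin the editor side can be: at the transport level, an LSP client and server exchange JSON-RPC messages, each framed by a `Content-Length` header followed by a blank line. The Python sketch below shows that framing only; the helper names `frame_message` and `read_message` and the sample file URI are made up for illustration and do not come from hylo-lsp or any editor integration.

```python
import json
from typing import Tuple


def frame_message(payload: dict) -> bytes:
    """Serialize a JSON-RPC payload with the LSP Content-Length header."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def read_message(buffer: bytes) -> Tuple[dict, bytes]:
    """Parse one framed message from the front of `buffer`.

    Returns the decoded payload and the unconsumed remainder.
    """
    header, _, rest = buffer.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    if len(rest) < length:
        raise ValueError("incomplete message body")
    return json.loads(rest[:length].decode("utf-8")), rest[length:]


request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/documentSymbol",
    "params": {"textDocument": {"uri": "file:///tmp/main.hylo"}},
}
shutdown = {"jsonrpc": "2.0", "id": 2, "method": "shutdown"}

# Two messages back to back, as they would appear on the server's stdin.
stream = frame_message(request) + frame_message(shutdown)
first, remainder = read_message(stream)
second, leftover = read_message(remainder)
```

Any editor with a generic LSP client already speaks this framing, so an integration mostly has to say how to launch the server binary and which files it applies to.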

Also, the prototype has most of the functionality I wanted, enough to at least inspect and navigate Hylo files more efficiently. There is much to be done in the LSP, but I will probably not be working much on this in the near term. I have a small baby at home and very limited time for side projects atm 😅

Also, it is very much a prototype, and may need to be redesigned in some very fundamental ways going forward, e.g. in terms of concurrency. But hopefully it can be used as a starting point!

dabrahams commented 8 months ago

I have zero experience with emacs, and I do not think I will be able to allocate time to setting up the emacs integration unfortunately.

Don't worry; I didn't expect you to. Emacs has LSP support built-in.

kyouko-taiga commented 8 months ago

Sorry for the late, late reply @koliyo 😔 (and happy new year).

I finally took the time to give your extension another try, and let me say that it is so cool! All features worked, and frankly I think the extension can probably already help with writing actual code in its current form. So really well done.

Regarding your remarks:

Compiler must have mechanisms for when source code is not on disk. In the LSP we get an in-memory representation of source code that is not saved to disk, and this needs some updated handling. I have a local patch that I will turn into a draft PR.

I saw that PR pass and some exchanges with @dabrahams. It seems to me that both of you are currently on top of this issue so I won't interfere, but give me a sign if my help is required.

Some concept of packages, either implicit based on directory structure, e.g. Python, or explicit, e.g. Swift.
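
The requirement quoted above can be pictured as an overlay: text the editor has sent for open documents takes precedence, and everything else is read from disk. The following Python sketch is only an illustration of that idea under stated assumptions; `SourceOverlay` and `uri_to_path` are hypothetical names, the URI handling assumes POSIX-style `file://` URIs, and none of it reflects how the actual patch or the Hylo compiler handles this.

```python
import tempfile
from pathlib import Path
from typing import Dict
from urllib.parse import unquote, urlparse


def uri_to_path(uri: str) -> Path:
    """Convert a POSIX-style file:// URI to a filesystem path."""
    return Path(unquote(urlparse(uri).path))


class SourceOverlay:
    """Serves editor buffers first, falling back to the file on disk."""

    def __init__(self) -> None:
        self._open_documents: Dict[str, str] = {}

    def did_open(self, uri: str, text: str) -> None:
        self._open_documents[uri] = text

    def did_change(self, uri: str, new_text: str) -> None:
        # Full-document sync: each change carries the complete new text.
        self._open_documents[uri] = new_text

    def did_close(self, uri: str) -> None:
        self._open_documents.pop(uri, None)

    def read(self, uri: str) -> str:
        if uri in self._open_documents:
            return self._open_documents[uri]
        return uri_to_path(uri).read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "main.hylo"
    path.write_text("fun main() {}", encoding="utf-8")
    uri = path.as_uri()

    overlay = SourceOverlay()
    from_disk = overlay.read(uri)       # nothing open yet: disk contents
    overlay.did_open(uri, "fun main() { print(1) }")
    from_memory = overlay.read(uri)     # unsaved editor buffer wins
    overlay.did_close(uri)
    after_close = overlay.read(uri)     # back to disk contents
```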

We have a notion of module and should be iterating more seriously on its design pretty soon. Stay tuned.

In a nutshell, a module is a unit of code distribution, like a library. I think it's larger than what Java would consider a package, which is more like a directory AFAIU. Changes in a single source file may have far reaching consequences in a single module, but less influence on other modules (or none if the modification doesn't touch an exported API).

My current approach is to compile Hylo stdlib + the current active file. So multifile programs do not work in the LSP atm.

The standard library will be a module of its own pretty soon so probably you won't have to compile it anymore. At least you won't have to type check it anymore. I think that you could already implement this strategy in your own driver, but it's likely simpler to just wait until the feature lands upstream.

IR lowering and analysis isn't blazingly fast either. Some parts are certainly parallelizable, but since the IR is mutating it won't necessarily be trivial.

Note that there should never be further diagnostics after we've applied mandatory IR passes. In the driver that is after we called lower(program: program, reportingDiagnosticsTo: &diagnostics). So for the purpose of live editing support, you can probably stop your compilation pipeline at this stage.
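
The advice above amounts to running the compilation pipeline only up to the last stage that can report diagnostics. Here is a toy Python sketch of that control flow. The stage names, the fake checks inside each stage, and `run_pipeline` are all invented for illustration; only the idea of stopping after lowering comes from this comment, and the real driver API is not modeled.

```python
from typing import Callable, List, Tuple

Diagnostics = List[str]
Stage = Tuple[str, Callable[[str, Diagnostics], None]]


def parse(source: str, diagnostics: Diagnostics) -> None:
    # Toy check standing in for a real parser.
    if source.count("{") != source.count("}"):
        diagnostics.append("parse: unbalanced braces")


def type_check(source: str, diagnostics: Diagnostics) -> None:
    # Toy check standing in for a real type checker.
    if "undefined_name" in source:
        diagnostics.append("type check: undefined symbol 'undefined_name'")


def lower(source: str, diagnostics: Diagnostics) -> None:
    # Stand-in for lowering plus mandatory IR passes: the last stage
    # that may still report diagnostics.
    if "use_after_move" in source:
        diagnostics.append("lowering: use of consumed object")


def optimize(source: str, diagnostics: Diagnostics) -> None:
    pass  # Never reports diagnostics in this model.


def emit_code(source: str, diagnostics: Diagnostics) -> None:
    pass  # Never reports diagnostics in this model.


PIPELINE: List[Stage] = [
    ("parse", parse),
    ("typeCheck", type_check),
    ("lower", lower),
    ("optimize", optimize),
    ("emit", emit_code),
]


def run_pipeline(source: str, stop_after: str) -> Tuple[Diagnostics, List[str]]:
    """Run stages in order and stop once `stop_after` has executed."""
    diagnostics: Diagnostics = []
    executed: List[str] = []
    for name, stage in PIPELINE:
        stage(source, diagnostics)
        executed.append(name)
        if name == stop_after:
            break
    return diagnostics, executed


# Live editing: stop after lowering, skipping optimization and code emission.
diagnostics, executed = run_pipeline("fun main() { use_after_move }", stop_after="lower")
```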

The LSP needs to know what files to send to the compiler for parsing/analysis.

My hot take is that it should be fine to just send the file being edited, but perhaps I'm too naive. I'll need your help to understand why my assumptions may be wrong.

Performance overall, and especially type checking.

Working on it 😅

At first I used the TypedProgram for basically all functions in the LSP, semantic tokens, symbols, diagnostics, etc. But specifically semantic tokens need to be more responsive than the current type checking performance, and we are talking very small programs here.

I think that is the right strategy for semantic highlighting. Building a typed program will be significantly faster once we land a first implementation of modules because you won't have to type check the standard library again. Most of the time you spend type checking "Hello, World!" today is actually time spent compiling Hylo.Int...

That won't fly in the long run, of course, so we have to start considering incremental compilation. I know almost nothing about this technique so I'll have to learn.
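
As a very rough picture of the simplest form of reuse, the Python sketch below caches the result of checking a unit of source text under a hash of its contents, so an unchanged unit such as the standard library is checked once while the edited file is rechecked on every change. `CheckCache` and the stand-in check function are hypothetical. Real incremental compilation also has to track dependencies between units, which this sketch deliberately leaves out.

```python
import hashlib
from typing import Callable, Dict


class CheckCache:
    """Memoizes an expensive check keyed by the hash of the source text."""

    def __init__(self, check: Callable[[str], str]) -> None:
        self._check = check
        self._results: Dict[str, str] = {}
        self.misses = 0  # number of times the expensive check actually ran

    def checked(self, source: str) -> str:
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if key not in self._results:
            self.misses += 1
            self._results[key] = self._check(source)
        return self._results[key]


# Stand-in for type checking; a real check would return a typed program.
cache = CheckCache(lambda src: f"checked {len(src)} chars")

stdlib = "type Int {}"  # unchanged between edits
edits = ["fun main() {}", "fun main() { }", "fun main() { print(1) }"]

for edited_file in edits:
    cache.checked(stdlib)       # hit after the first iteration
    cache.checked(edited_file)  # miss: the text changed

total_checks = cache.misses
```

With three edits, the expensive check runs four times (once for the unchanged unit, three times for the edited file) instead of six.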