[New Feature]: Implement a compiler server

manuranga commented 1 year ago

Currently each completion starts in a new process. Reusing the same JVM as daemon process should improve performance.

Scope

Reusing the same server for LS will be considered under a separate issue.
~~At this stage a single server process will only handle a single project.~~
Current implementation will not be removed. Other CLI commands will keep using the existing implementation. If the communication to the server fails, CLI will fallback to the current implementation.

Changes to the CLI

Name daemon is picked in the CLI to avoid any confusion with Ballerina services.

bal build --no-daemon - This will build the project without using the daemon.
bal daemon start - This will start the daemon manually. Normally this is not needed as the daemon will be started automatically when needed.
bal daemon stop - This will stop the daemon manually.
bal daemon status - This will show the status of the daemon.

Communication

Server will start on a random port in a predetermined range. This port will be written to a file in the project build directory.

         handshake: validate server version
CLI --------------------------------------------> Compiler Server (Deamon)
    <--------------------------------------------

       request: build/test/run for a given root
     ------------------------------------------->
     <-------------------------------------------

Complication

~~We only support a single project because the existing compiler is not designed to run multiple projects in the same JVM. To enable it we may need to get rid of any global state.~~
There is a risk of accumulating stale caches. We need to monitor the build directory and clean server if it is deleted. Maybe we need a time-to-live as well.
Ballerina listeners can't currently be started in only the loopback interface. This is a security issue. We may need to fix this if we are to use Ballerina for the backend.

hasithaa commented 1 year ago

How to handle multiple projects. Assume, I have 3 active projects, and I want to build these three ones after the other. How the demon behaves in this situation?

In LS implementation we can handle multiple projects IIRC. @IMS94 Can you please comment, on how we handle this?

sameerajayasoma commented 1 year ago

Multiple package support is not implemented in the Project API level. I.e. multi-package projects. I assume the lang-server uses a map of Projects and load the Project instances based on the file path.

IMS94 commented 1 year ago

Yes, we don't support multi package projects at Project API level. But as @sameerajayasoma said, LS keeps a map of projects (user can have several Ballerina projects in the current VSCode workspace) and use the file path to differentiate the project when required.

We do that in BallerinaWorkspaceManager class. I think we can use a single process to support multiple projects at the compile server level using a similar approach. We have to make sure that no 2 compilations run for the same project at the same time. In LS we have used locks per project to achieve that.

Since the client will communicate with the compiler server over a port, I think we will need a way to do the following:

Start a compilation for a given project
Cancel the compilation of a project (If user press Ctrl +C)
Check compilation status of a project (already running or not) - (the project root path can be the identifier) or away to get notified when compilation is finished.
- Will the request sent from client be blocked until the compilation completes?

Above things are in addition to checking the compile server status.

Impact for Language Server Currently, LS keeps track of user's changes to the files in memory (there's a syncing protocol defined in language server protocol for that). Because of that, will LS be able to benefit of a compiler server (since the in-memory changes have to be reflected in the compile server to perform the compilation correctly) ?

manuranga commented 1 year ago

Thanks @IMS94 I think having a single process will simplify things. I will use that approach.

Cancel the compilation of a project

Yes, I think we should use a duplex communication channel (similar to LS protocol), it can be TCP socket or a WebSocket instead of REST. Then CLI can send additional messages (such as Ctrl+C) while the compilation is happening.

Will the request sent from client be blocked until the compilation completes?

Yes I think it makes sense to reject or block parallel compilations.

Impact for Language Server

I as pointed in scope section, eventual goal is to merge LS and this compiler server. At that point we should find a code level abstraction that preserve the current behavior for LS side (ie: compile from in memory) while also providing CLI behavior of compiling form disk.

manuranga commented 1 year ago

Caveat regarding the single process: I think lower passes (such as codeGen) is not tested in multi-package mode, since LS only needs pre-bir-gen side. So there might be race-conditions in that prat of the compiler.

IMS94 commented 1 year ago

@manuranga yes, we don't run lower passes in LS. I think those phases are required when the command is run via the CLI. From LS side we have few more concerns/scenarios:

Users can have unsaved documents open in VSCode. LS will have the updated (unsaved) version in memory and will compile that version. But, if a user runs bal build, the version in disk should be used. Therefore, will we be able to facilitate both LS and CLI via a single compile server process?
Once the compilation is done, LS/Semantic API need to access the syntax tree and PackageInstance (AST with semantic information). How can that information be shared? Via another request/response by serializing?

manuranga commented 1 year ago

Users can have unsaved documents open in VSCode.

I am not planing this level in integration yet, but when we do, we'll have to refactor code to see different open files depending on the path (CLI vs VSCode)

2. Sorry I didn't fully understand the question. But AST not sharable between CLI and VSCode (if there are open files that are not saved), is that what you are referring ?

manuranga commented 1 year ago

@jclark suggested trying to communicate between the CLI and the compiler server using the LS protocol as well. I am trying this out.

manuranga commented 1 year ago

Progress

Currently I have implemented a CLI that can talk to the backend via the language server protocol. I managed to send the build command as workspace/executeCommand instruction and execute it in the backend.

Normal bal build takes 5450 ms.
First run of the compiler server approach takes 3000 ms. (out of which backend is 1500ms) Didn't expect this to be faster. I think it's only faster because I am skipping some steps.
Subsequent runs of the compiler server approach takes 1400 ms. (out of which backend is 200ms)

Plan

I need to figure out and implement the missing steps of compiler server backend (lookup central, create fat jar).
To support run we need to send program output incrementally. I think we can use LSP's new $/progress feature for this.
It is still too slow. I suspect cli is too slow since it still has all the classes. We may need to write a thinner CLI. (James suggested maybe we should look into a non-JVM language). But need to profile before doing anything.
Another option is to switch paths and focus on the VSCode. Challenge 1 : Need to learn VSCode extension code. Challenge 2 : Where to provide args? Challenge 3 : We can provide run command off of the in memory buffers, but build needs to see on disk files. This is difficult to achieve under the current project API since it sees a virtual file system. We could autosave. Advantage 1 : This may be more useful for the average user, assuming they use VSCode more often than CLI. Advantage 2 : I don't need to worry about LS process management or writing a client.

NipunaRanasinghe commented 1 year ago

@manuranga I prefer having a compiler sever exposed via the CLI (instead of focusing only on the VSCode experience), because probably we can expect considerable performance improvements of the debug server with the proposed CLI approach.

To provide a bit of context, right now the debug server depends on the bal run command to launch the debuggee program. Therefore, if the user wants to rerun a debug session even without doing any code changes, the subsequent sessions will still go through all the compilation phases again and again, which could be avoided with this compiler sever approach.

So once you have a working version of the compiler server, I'm happy to try integrating the same with the debug server and working on the possible improvement points.

manuranga commented 1 year ago

Ballerina extension has many UIs where a user can start a Ballerina application.

But they all seem to come down to ether 1) Runing bal command form VSCode 2) Running the Debug Server (DS) which in turn runs the bal command

Finally the compiled program is executed as a yet another process.

Solution A) Make the bal command use a background server, and not touch the VSCode extension. Pro: No need to for me to touch the VSCode extension code. Con: Add an additional layer, namely, CLI calling Compiler Server. Currently the added latency is too much for this approach to be an improvement over existing (2-3s). Maybe possible to improve using a thinner approach.

Solution B) Compile inside the exiting LS process in above case 1, and compile inside DS in case 2. Pro: Faster Con: I have to touch both LS and DS

@NipunaRanasinghe and others, I currently think solution B is the way to go. wdyt?

manuranga commented 1 year ago

@jclark pointed out that for above solution 2 to work I need to increase the lifetime of DS. AFAIK we start a DS every time use a start a new debugging session. Please confirm @NipunaRanasinghe. Does it make sense to keep a DS running and only connect to it for each session?

Solution C) Compile inside the exiting LS process in above case 1, DS talks to LS in case 2.

Solution D) Have a separate Build Server, make LS, DS, CLI all talk to it. ( @hevayo suggested something along this line: https://build-server-protocol.github.io)

NipunaRanasinghe commented 1 year ago

@jclark pointed out that for above solution 2 to work I need to increase the lifetime of DS. AFAIK we start a DS every time use a start a new debugging session. Please confirm @NipunaRanasinghe. Does it make sense to keep a DS running and only connect to it for each session?

@manuranga yes true. Right now we use the single session mode approach for our DS, mainly to avoid complexities in multi session mode (e.g. having to handle multiple client sessions and remote VMs at once, requirement to implement the whole DS in a concurrent-safe way).

so for solution B, we will require a major rewrite in the DS side.
for solution C, we will have to open another connection with LS (per session?), which will make DS to be depended on LS (which is okay assuming that LS is guaranteed to be always running and stable)
regarding solution D, isn't it somewhat similar to solution A? (apologies If I'm missing something obvious) Because if we have a separate build server and make CLI talk to it, in LS and DS we should be able to keep using CLI commands (which will talk to the build server similar to option A) right?
on a side note, IMO it might be better to look for a solution which will abstract this compilation caching server as a layer coupled with CLI (instead of implementing the same in each component like LS and DS), which will become more scalable incase we'll have more use cases similar to LS and DS in the future.

hevayo commented 1 year ago

@manuranga we did a PoC sometime back to see if we can use LS to build the executable the blocker we faced was that the Project API not designed to handle syntax node analysis tasks again after codegen. Have you notice the same or if so we might have to solve that before hand. Otherwise once you do the codegen you will not be able to reuse the same project in LS.

manuranga commented 1 year ago

regarding solution D, isn't it somewhat similar to solution A? (apologies If I'm missing something obvious) Because if we have a separate build server and make CLI talk to it, in LS and DS we should be able to keep using CLI commands (which will talk to the build server similar to option A) right?

In Solution A, the CLI is not long-running. It has to be started each time which adds latency. Solution A will only be feasible after significant changes to CLI since current one is too slow. But we get to keep the CLI abstraction. In Solution D, LS and DS has to communicate with a per-running Build Server via TCP.

@manuranga we did a PoC sometime back to see if we can use LS to build the executable the blocker we faced was that the Project API not designed to handle syntax node analysis tasks again after codegen. Have you notice the same or if so we might have to solve that before hand. Otherwise once you do the codegen you will not be able to reuse the same project in LS.

Haven't noticed yet, will keep an eye. If someone remembers what exactly broke it would be helpful.

After looking at above, I think Solution C is the lowest cost, others will take 1 month at least. Since getting something released is high priority, my current plan is to do Solution C case 1 as the first stage, test and release first. We should figure out the plan for next stages.

manuranga commented 1 year ago

I am trying to find a place to show a stop button for long running processors (such as services). I couldn't find a good contribution-point. Only options I can think of is to add a whole panel, instead using built-in ᴏᴜᴛᴘᴜᴛ panel. This is bit of extra work but looks like it's needed.

@gigara Does that sound Ok?

gigara commented 1 year ago

@manuranga How about Progress notification?

manuranga commented 1 year ago

Thanks @gigara, that actually gave me even easier idea. I added a new command to stop a running program, Ballerina: Stop. Maybe I can even link it form the output window.

manuranga commented 1 year ago

@jclark raised a point on CLI inputs and CLI specific features (such as reading a password). I just check and it doesn't work in the current debugger either, but in current Run it does. DAP spec gives a way for the debugee to run in a CLI. We could do something slimier. Should it be opt-in or the default behavior?

Another option is to emulate user inputs by sending them to Server over LSP, where Server will feed them into debugee via the pipe. Under this approach we still can't support full CLI features (such as reading passwords) and it's more work for the extension side.

We also discussed the possibility of creating Run as another (long running) debugger. In this case we will get the default debug UI for free.

To Summarize: Approach A-1: Trigger via the extension. Run as a child of server. Send inputs via LSP. Approach A-2: Trigger via the extension. Run as a child of VSCode. CLI features are supported. Approach B-1: Trigger via a Debug Adapter. Run as a child of server. Inputs not supported. Approach B-2: Trigger via a Debug Adapter. Run as a child of VSCode. CLI features are supported.

NipunaRanasinghe commented 1 year ago

@jclark raised a point on CLI inputs and CLI specific features (such as reading a password). I just check and it doesn't work in the current debugger either, but in current Run it does. DAP spec gives a way for the debugee to run in a CLI. We could do something slimier. Should it be opt-in or the default behavior?

@manuranga We have already implemented this run-in-terminal feature in DS to support CLI inputs. (For instructions refer to the terminal attribute in Ballerina launch.json configurations documentation). However, currently the user has to opt-in for this (which is the common behaviour in other language debuggers as well), but we can easily make it the default behaviour if we want to.

Another option is to emulate user inputs by sending them to Server over LSP, where Server will feed them into debugee via the pipe. Under this approach we still can't support full CLI features (such as reading passwords) and it's more work for the extension side.

If we can use the above run-in-terminal feature, do we still need to think of this path?

We also discussed the possibility of creating Run as another (long running) debugger. In this case we will get the default debug UI for free.

IIUC this is equivalent to running the debugger without having any breakpoints, right? Looks okay but we may need to check on any performance impacts of always running the JVM in debug mode (https://stackoverflow.com/questions/3722841/side-effects-of-running-the-jvm-in-debug-mode)

manuranga commented 1 year ago

We have already implemented this

Great. Opt-in is good.

If we can use the above run-in-terminal feature, do we still need to think of this path?

This is to support input in the default case. Of course we can decide not to support inputs in default case and only support it in run-in-terminal.

IIUC this is equivalent to running the debugger without having any breakpoints

No. The idea is to show as a debugger in the UI, but actually run java normally. This is just to get the floating toolbar appear in the UI.

manuranga commented 1 year ago

Hi @NipunaRanasinghe and others,

As the next iteration, my plan is to convert current implementation into a debugger. I have experimented with DebugAdapterInlineImplementation, it is working. I am using the Run Without Debugging feature (which currently just debugs) to trigger fast-run now. Regular Debug still goes via existing path. I am planing to rewire Run code lens to also trigger Run Without Debugging command.

This gives me the floating debugging button toolbar. However I couldn't not figure out how to disable Pause button in the toolbar. I am setting the capabilities to say I don't support supportSuspendDebuggee but it still shows up.

NipunaRanasinghe commented 1 year ago

@manuranga apologies for the late response.

Hi @NipunaRanasinghe and others,

As the next iteration, my plan is to convert current implementation into a debugger. I have experimented with DebugAdapterInlineImplementation, it is working. I am using the Run Without Debugging feature (which currently just debugs) to trigger fast-run now. Regular Debug still goes via existing path. I am planing to rewire Run code lens to also trigger Run Without Debugging command.

I think this approach will be more promising as we'll have more control over the process we are running, with the support of DAP capabilities.

This gives me the floating debugging button toolbar. However I couldn't not figure out how to disable Pause button in the toolbar. I am setting the capabilities to say I don't support supportSuspendDebuggee but it still shows up.

The supportSuspendDebuggee flag actually indicates whether the debuggee should stay suspended(or resume the execution) when the debug server gets disconnected. It seems DAP doesn't support disabling the pause functionality via the capability registration atm. Probably we can open an ticket in DAP Repository to verify it with the maintainers.

manuranga commented 1 year ago

I have restarted working on this. Currently PackageCompilation gets modified when passed JBallerinaBackend. This creates issues since Language Server expect it to be in the pre-code-gen state.

I tried cloning PackageCompilation before passing to JBallerinaBackend. It was difficult to figure out out what part of the object to clone vs not. It seems even if I run the JBallerinaBackend on the clone, it has some effects on the Language Server. I created following diagram to understand what parts need to be cloned.

PackageCompilation instance

Since this path turned out to be difficult, I tried something else. I marked the root package's modules as modified right after running JBallerinaBackend. So far it seems to work in unit tests, need to test with complex scenario such as compiler plugins.

manuranga commented 1 year ago

In today's call with @sameerajayasoma we summarized the solution in to 3 approaches Q: How to keep semantic api working after desugar. 1) Make sure the tree is always in a valid state, even after desugar. 2) Clone the tree before desugar. Use one for desugar and other for semantic api. 3) Throw away the tree after desugar and start from a new tree for semantic api.

Currently I am exploring 3.

manuranga commented 12 months ago

I tried the approach 3. It solved some issues I had with before, but some Semantic APIs are not working. pkg Found out cloned BLangPakage is missing a link to top level nodes. Looking into it now.

manuranga commented 12 months ago

Approach 3 seems promising, after changing to invaliding each document, instead the whole module. Simple service works under run fast now. In the process of creating unit tests for this.

Still having issues with multi module projects.

sameerajayasoma commented 11 months ago

Looks good. What is the percentage run improvement with FastRun?

manuranga commented 11 months ago

nBallerina project run Cold start CLI = 2m 8s Warm start CLI = 7s Fast run = 3.8s

hello world run Cold start CLI = 24s Warm start CLI = 1.13s Fast run = 0.2s

jclark commented 11 months ago

What exactly is the difference between cold and warm start?

manuranga commented 11 months ago

First build or after bal clean it's cold. Subsequent builds are warm up to 24h. In 2201.8.4 the warm CLI times are significantly improved.

manuranga commented 10 months ago

Multi-module projects work if I remove the shrinkDocumnet fix, which is a temporary fix for OOM issue when working with healthcare libraries. It removes source as soon as each module is code generated. But that means Fast Run feature crashes the LS when working with healthcare libraries. There is a chance this will be fixed by the other fix for the OOM issue, but it is not yet released.

ballerina-platform / ballerina-lang