dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
19.07k stars 4.04k forks source link

Compiler server needs a persistent mode #46024

Open jaredpar opened 4 years ago

jaredpar commented 4 years ago

The default behavior of the compiler server is to stay alive waiting for request for 10 minutes. After that time has expired the server will shut down. This 10 minute number was chosen by a highly scientific process back in the Roslyn 1.0 days:

10 minutes represented a nice sweet spot that kept everyone happy.

This universal timeout model though doesn't fit all of our use cases. Consider for example donet watch. As long as the watch process is alive there is an expectation that a compilation should be happening soon and our tooling should optimize for that. To support this the server should ideally be alive for as long as dotnet watch is alive so it's ready to make the next compilation as fast as possible.

Need to explore methods to make the server cooperate better with scenarios like this. Think the simplest model is to add a feature to the server such that it will stay alive until certain processes, like dotnet watch, have exited. That could be considered in parallel with the timeout code so that the server will stay alive until both the timeout has expired and all of the monitor'd processes have exited.

CyrusNajmabadi commented 4 years ago

Anotehr way to do this is to still have the compiler server shutdown after a certain amount of time, but then have the owning process (i.e. dotnet watch) send it a heartbeat of some sort to let it know that it should extend out its deadline for when it should go down.

I've found in the online space this works well and ensures that thigns are designed to go down by default, and they themselves are nto written to stay running indefinitely (which can lead to problems).

jaredpar commented 4 years ago

Note: i'm not firm on the design laid out in the issue. It's more of an initial idea that I had and wanted to vet with a few others. There are some details to work out.

The heart beat approach is another thing I've considered but it adds some new behaviors to the system. Particularly it means that we have to expose the server and it's contracts to the caller to some degree.

Today the server is essentially completely hidden. Developers build and are oblivious as to whether or not the server comes into play or not: it's an implementation detail. There are a few parameters that can be used to control the server, like bypass it entirely or specify the pipename, but that's for advanced scenarios. Exposing the server to customers means we need to essentially expose a lot of our contracts:

  1. Pipe name to connect to: it's very important that this is unique per compiler version (avoid cross chatter) so we implement it by hashing a few binaries and chopping up the hash in an OS specific manner (silly OSX and it's limitations on pipe names). Exposing this would be tricky but necessary to establish communication.
  2. Restart policy: during a build the compiler server will regularly restart by design when certain conditions occur. Ensuring that MSBuild doesn't stall out when communicating with the server during restarts is a tricky dance but it's one heart beat customers would have to replicate.

I think to do this we'd essentially need to offer up an API that teams like dotnet watch could consume to force the heart beat function. Then we have to work out deploying that, making sure it matches compiler versions, etc .... It seems a bit complicated.

That's why thus far I've preferred the model I laid out above. Make the parent process id part of the MSBuild parameter set. That flows into the server and we can handle all of the details. Best of all is that it works across restarts of the server (assuming it's passed for every build command which is a small ask on consumers).

I've found in the online space this works well and ensures that thigns are designed to go down by default, and they themselves are nto written to stay running indefinitely (which can lead to problems).

Agree with this strongly. Waiting on the parent process id seems like a fairly safe thing to do here. There is a small window for a PID re-use race to occur but that seems pretty extreme (at least for my understanding of PID re-use problems).

CyrusNajmabadi commented 4 years ago

Today the server is essentially completely hidden

Ah. That makes sense. And i can see why you'd want to preserve this. Sounds like you're considering lots of options. Good luck finding the best one with these constraints :)