ocaml / dune

A composable build system for OCaml.
https://dune.build/
MIT License
1.59k stars 396 forks source link

[RPC] API for module build notifications #4439

Open rgrinberg opened 3 years ago

rgrinberg commented 3 years ago

This issue describes a current problem with the interaction of LSP (or merlin) and dune in watch mode. It should be possible to fix this problem with RPC.

The issues is as follows:

To work around this issue, I think the property we're looking forward is: whenever a user is editing some module X, dune should notify the client that X has been re-compiled. In the case of LSP, this will prompt the LSP server to type check the source of that module again and post an up to date list of errors.

We can go about it in two ways:

The first one is simpler, while the second one is more efficient. I'm inclined to go with the second one as $ dune build @all in a workspace with 1k modules would trigger 1k notifications. That seems excessive and 1k modules is certainly not an upper bound.

ghost commented 3 years ago

/cc @cwong-ocaml

rgrinberg commented 2 years ago

This was discussed in the recent dune meeting. @cwong-ocaml what was the workaround used at jst? I don't recall anymore.

rgrinberg commented 2 years ago

I have a new proposal to solve this problem.

We an rpc request that is morally equivalent to:

val follow_build : [`Module_path of string] -> unit Fiber.t

This call will do two things:

On the lsp side, we will loop follow_build on all open module sources and trigger merlin to type check the module whenever the request completes.

@voodoos does this make sense to you?

cc @cwong-ocaml, @jeremiedimino who were present at the meeting and were to interested in how this would work at jst.

voodoos commented 2 years ago

This sounds good. This way we offload the dependency analysis to Dune: the cmi of a module would be rebuild by Dune if the source of that module changed or any of it's dependencies did.

rgrinberg commented 2 years ago

@snowleopard to implement this feature, I need some additional functionality from Build_system. A way for Build_system to notify when certain when targets have been rebuilt. I'm thinking something like:

val on_rebuild : Path.t -> (unit -> [`Success | `Fail] Fiber.t) -> [`Unsubscribe of (unit -> unit)]

But I'm open to other proposals. What do you think about this?

snowleopard commented 2 years ago

@rgrinberg I can't quite understand what the second argument means. Did you mean something like this instead?

val on_rebuild : Path.t -> ([`Success | `Fail] -> unit Fiber.t) -> [`Unsubscribe of (unit -> unit)]
snowleopard commented 2 years ago

Assuming the above is correct, I would probably generalise it slightly, to make it easier to subscribe and unsubscribe in one go:

val on_rebuild : (Path.t -> bool) -> f:(Path.t -> [`Success | `Failure] -> unit Fiber.t) -> [`Unsubscribe of (unit -> unit)]
rgrinberg commented 2 years ago

Did you mean something like this instead?

Yes I did.

I would probably generalise it slightly, to make it easier to subscribe and unsubscribe in one go:

This needs a bit of clarification for me. Is the idea to do something like:

let interested_paths = ref Path.Set.empty in
let (`Unsubscribe unsub) = Build_system.on_rebuild (fun p -> Path.Set.mem !intersted_path p) ~f in
...
let on_client_open_file file = interested_paths := Path.Set.add !interested_paths file in
...

So the first argument acts as a set of the paths that I'm intersted in?

snowleopard commented 2 years ago

So the first argument acts as a set of the paths that I'm intersted in?

Yes, exactly. From the point of view of Build_system, it's as easy to support this more generalised version as the original one, but the user now has a few more options:

rgrinberg commented 2 years ago

Okay, that sounds good to me. @jeremiedimino wdyt?

rgrinberg commented 2 years ago

Actually, this api doesn't have a good performance profile. Consider n subscribers with m build targets. Build_system will need to make n * m calls to the callback passed to on_rebuild. While the api that I originally proposed would only require a map lookup for every m and a traversal of only the subscribers that apply to this file.

snowleopard commented 2 years ago

That's a good point!

If the API allows the user to add a subscriber only to a specific Path.t then it's possible to put all subscribers in a Path.Map.t. This can be bad if m (the number of paths in which we are interested) is large because then we have to keep a large map in memory. But I think we expect that m is typically going to be small.

With the API that I proposed, we will need to store subscribers in a list and traverse that list on completion of every build action. This can be bad if n (the number of subscribers) is large.

So, yes, I can see how we can prefer one or the other depending on our assumptions about m and n. Perhaps, in the end we might need both, but it seems better to start with simpler Path.t-based subscribers because they fit well the described use-case.

ghost commented 2 years ago

Given the conversation the simpler API seems fine to me.

As a side note, I've seen more polymorphic variants used just as documentations creep in in recent PRs. Such as the Unsubscribe here. These are light and convenient, however they do have a runtime cost. As we want Dune to scale, we need to be mindful about these extra allocations because they all add up. Here we could use an Unsubscribe.t type instead of [`Unsibscribe of (unit -> unit)]. It would be just as clear and would avoid a bunch of non-necessary allocations: the one for the polymorphic variant and the one for the closure.

snowleopard commented 2 years ago

Agreed about polymorphic variants: defining a separate Unsubscribe module seems better.

P.S.: I just remembered that we have a cute monoid that could be used here and now I have to share this knowledge :D

It's possible to do this:

module Unsubscribe = Monoid.Endofunction.Left (Unit)

and then use reduce : Unsubscribe.t list -> Unsubscribe.t to unsubscribe from everything in one go.

ddickstein commented 1 year ago

I'd like to propose broadening the proposed on_rebuild function to the following:

module Status : sig
  type t =
    | Success
    | Fail of { errors : Diagnostic.t list }
    | Rebuilding of { dependency_errors : Diagnostic.t list }
    | Not_building (* File is not in the transitive closure of current build targets *)
end
val on_rebuild : Path.t -> (Status.t -> unit Fiber.t) -> Unsubscribe.t

This can be useful beyond the LSP use case that @rgrinberg cited in his original post - we can use this to surface per-file build status indicators, for example. It's particularly helpful to know whether an issue was encountered in a dependency of the file that is preventing it from getting rebuilt so that we can jump to that location to resolve it. I'm sure there are plenty of other use cases that I'm not imagining, but generally the more information we can surface about the state of the build for a given file the more useful tooling we can build on it.'

EDIT: The specific use I have in mind for build-status indicators is to know whether to unstale diagnostics and whether we can expect Merlin to work properly in the file. So if something about this API would be insufficient for those purposes, we should discuss this further before this is built.