Closed mlin closed 5 years ago
w.r.t. LSP, we’ll likely use workspace/didChangeWatchedFiles capability, where the client periodically sends the list of “watched” files to the server. Then we set up each client to watch for all .wdl files in the workspace, and this will inform the server which file paths to take into account.
Happy to report our initial implementation of in-memory parsing for LSP: https://github.com/broadinstitute/wdl-ide/blob/bf064a0c8f093e4d1c984e561f0bd1eea1009aa0/wdl-language-support/server/server.py#L135-L169
The server:
1) Listens to the initial textDocument/didOpen
notification from the client, and caches
the received contents of the document into a Dict[uri, List[line]].
2) Listens to textDocument/didChange
notifications from the client, which send incremental
updates for the document. The server adjusts dict[uri] text accordingly on every update. Then, it runs WDL.load(doc.uri, source_text=doc.text)
asynchronously with a debounce Timer, so that the new content gets re-validated only when there were no more changes sent over a given timeout (1 sec currently).
3) Listens to the final textDocument/didClose
notification from the client, and removes the cached content from memory then.
It turns out that pygls
already maintains a registry of open documents on the server, which we can access easily with ls.workspace.get_document(uri).source
I've now also enabled auto-resolution of WDL paths [1]:
1) Initially, the server takes a WDL URI and finds "workspace root" folders that are prefixes of this URI. It then enumerates all folders containing *.wdl
files within those roots. Finally, it caches this set of folders for each root.
2) Upon this and subsequent invocations, the server passes the relevant list of folders from (1) as path
argument to WDL.load()
. This results in successful resolution of all imports.
3) The client watches for created/deleted *.wdl
files in all workspaces, and sends such changes to the server. The server then updates the relevant cache of folders from (1).
As a result, we've achieved efficient and automatic resolution of WDL imports. In the Optimus pipeline, initial path resolution takes < 20 ms. In a bigger project with tons of node_modules
folders, it takes ~ 0.8 sec. Cached resolution takes << 1 ms in both cases.
@rexwangcc @mohawktrail
@dinvlad I'm thinking about making a new version of WDL.load()
into which one could pass an async function with this signature:
async def read_source(uri: str, path: List[str], importer_uri: Optional[str]) -> str:
...
where
path
is the user-provided search directories if anyimporter_uri
is the filename/URI of the document from which the requested document is being imported if anyThe default implementation of course will just treat uri as a filename, open and read it. But you could override it to intercept requests for files with unsaved edits in the IDE.
Naturally I gravitated to this because of how it punts the aforementioned complications to you :sweat_smile:. Do you think it would work?
Thanks - so if I understand it correctly, this will allow us to do something like:
WDL.load(
wdl_uri,
path = paths,
read_source = lambda uri: ls.workspace.get_document(uri).source,
)
In this particular example, we just ignore path
and importer_uri
args to read_source()
, because our resolution function only depends on the uri
.
We also assume that read_source()
will be applied to all URIs, including wdl_uri
in the example (so we no longer have to pass source_text
).
If the above holds, then this seems like a good match for our needs!
WDL.load()
accepts asource_text
string as an alternative to reading from the filename/URI, which is useful for Language Server Protocol to analyze code not necessarily written to disk. But if the WDL imports other files, right now they still need to be written to disk.We'd like there to be a way to pass in multiple in-memory source files which may import amongst themselves. This might take the form of a
Dict[str,str]
from filenames to source texts; or some kind of abstractSourceLoader
interface, which could enveloppath
and the currentimport_uri
callback. Make it asynchronous while we're at it??Subtleties: