scalameta / metals-feature-requests

Issue tracker for Metals feature requests
37 stars 4 forks source link

Support for non-file URIs #2

Closed felixfbecker closed 4 years ago

felixfbecker commented 5 years ago

Hi 👋 and congrats on all the progress metals has made so far. It's really impressive!

I peeked around the code a bit, but couldn't answer the question whether metals supports URIs other than file:. file: URIs are usually used by editors for on-disk files, but when deploying a language server as a service for online code viewers, web IDEs, code intelligence backend for doc pages with code snippets etc the language server typically runs in an isolated container that does not have the files checked out to disk.

In that scenario, instead of file:// URLs, at Sourcegraph we give the language server http:// URLs that point to endpoints where the text documents can be retrieved with a simple GET request.

Does metals have the infrastructure to handle URIs other than file:?

olafurpg commented 5 years ago

Hei! Thanks for opening this issue.

I'm optimistic it would be possible to refactor the Metals codebase to play nicely with non-file URIs through JDK nio.FileSystem. However, I'm afraid it would not solve the problem since our dependencies like the sbt build tool and zinc, the incremental compiler, would also need to work with non-file URIs. I estimate it would require a big effort to get there and it's unlikely to get prioritized by us in the near future.

felixfbecker commented 5 years ago

Hmm, so sbt and zinc all read directly from disk? Are they used as libraries or shelled into as a CLI tool? I would imagine that having "injectable" FileSystem interfaces would be good for unit testing too

olafurpg commented 5 years ago

Yes, I don't think it will be possible to provide sbt or zinc a custom file system. Do you know how works for Java? Is it possible to run Maven or Gradle with virtual file systems?

felixfbecker commented 5 years ago

Our own Java language server https://github.com/sourcegraph/java-langserver has support for getting files from a virtual file system and supports both Maven and Gradle. We are also looking into making https://github.com/eclipse/eclipse.jdt.ls work. Maybe @beyang can give more details?

beyang commented 5 years ago

The Java language server (eclipse.jdt.ls) makes every file access through an interface like IFile, which makes it easy to add new virtual file implementations (like files fetched remotely).

I'm guessing you have a similar set of interfaces in Metals, at least at the compiler level.

For build systems (like sbt), there are two approaches:

olafurpg commented 5 years ago

Metals uses a similar architecture as "index-while-building" in xcode (see https://www.youtube.com/watch?v=jGJhnIT-D2M). We delegate compilation to the build tool and read SemanticDB files from disk produced by that batch compilation.

The Scala compiler supports virtual file systems, and I'm not sure but maybe also zinc (the incremental compiler). However, even if sbt itself supported a VFS, all the installed sbt plugins would also need to support the VFS and that's outside our control.

I think the best bet is to run the build script in a local FS as you mention @beyang. At that point, what are the benefits of the VFS if you already run parts of the integration on a regular FS?

beyang commented 5 years ago

If the build tools assume local FS, then it makes sense to continue assuming local filesystem access. Sourcegraph's use case requires hitting non-local files over HTTP, but it probably makes sense to support this as an outer layer wrapping the core part of the language server. Feel free to close this (I think more specific questions about handling of HTTP URIs would make sense as a separate issue).

Thanks for taking the time to discuss this!

olafurpg commented 5 years ago

@beyang Fair enough, I will close this ticket then as "won't fix" I'm afraid :/ I wish I could give a better response but migrating the whole sbt plugin ecosystem to support a VFS is likely an hopeless task. Thanks for the discussion!

felixfbecker commented 5 years ago

@olafurpg could the language server fetch the files if the root URI is a non-file URI and write them to disk? That way, whether a virtual FS is used or not is just an implementation detail, from the client's perspective non-file URIs are supported

olafurpg commented 5 years ago

@felixfbecker that's definitely on the table, what would we need to do on the Metals side to fetch all the files? Would requests use http URIs? We could translate them to local file URIs before processing.

felixfbecker commented 5 years ago

Yeah, the most common use case is HTTP URLs (for both rootUri and text document URIs). The HTTP URLs would point to endpoints where the resource can be requested with a simple GET request.

For example, the root URI could be https://sourcegraph.com/github.com/scalameta/metals/-/raw/ and you could fetch the files you need by walking directories, or alternatively you could also fetch the root (or any subdirectory) as an archive by simply sending an Accept: application/zip (or application/x-tar) header. Here's an example of a text document under that root URI: https://sourcegraph.com/github.com/scalameta/metals/-/raw/sbt-metals/src/main/scala/scala/meta/metals/MetalsPlugin.scala You can determine the relative file path by resolving the text document URL against the root URL.

Would that work?

olafurpg commented 5 years ago

Unzipping an archive file and converting http:// URIs to local file URIs should not require a big refactoring on our side. I've re-opened this ticket to be about

Is it OK if I write integration tests against the sourcegraph.com URLs? I could create a repository with a tiny sbt build and assert that build import and goto definition works as usual.

I won't likely have time to look into this in depth until later part January or maybe February.

keegancsmith commented 5 years ago

Is it OK if I write integration tests against the sourcegraph.com URLs? I could create a repository with a tiny sbt build and assert that build import and goto definition works as usual.

Yes that is totally fine :) @felixfbecker do you have an example initialize rootPath which is https?

Edit: Whoops I see a reply above includes one.

olafurpg commented 5 years ago

Does Sourcegraph handle HTTP paths to files that don't exist in the git repository? For example, sometimes the build generates source files that are ignored by git. Also, to jump to library dependency sources, Metals creates text files under $WORKSPACE/.metals/readonly (it would be cool to cross-link git repos for Sourcegraph but I estimate that requires non-trivial custom development).

olafurpg commented 5 years ago

Concretely, what happens if Metals returns in a textDocument/definition response a HTTP URI to a file under https://sourcegraph.com/github.com/scalameta/metals/-/raw/path/to/generated/file.scala the that doesn't exist in the original git repo?

felixfbecker commented 5 years ago

Concretely, what happens if Metals returns in a textDocument/definition response a HTTP URI to a file under https://sourcegraph.com/github.com/scalameta/metals/-/raw/path/to/generated/file.scala the that doesn't exist in the original git repo?

When you request that raw API URL, it would return a 404 Not Found. When the language server returns that as the definition location, Sourcegraph will construct a UI "blob" URL (i.e. /blob/ instead of /raw/) and navigate the browser to that, which would show a 404 file not found page to the user. It's okay if this is a known limitation, for example this will happen for TypeScript too if the project does not publish declaration source maps.

(it would be cool to cross-link git repos for Sourcegraph but I estimate that requires non-trivial custom development)

What TypeScript does is recognize when a result URI is out-of-workspace (i.e. only on disk, but not in the git repo and therefor not accessible through the raw API under that URL) and rewrite it to a raw API URL of the repository it actually belongs to (if it can figure it out). How that works is language-specific, but for TypeScript it checks the package.json of the package in node_modules for a repository field, then builds a raw API URL for that file in the repository of the package.

olafurpg commented 5 years ago

Would it be possible to implement some kind of "middleware" language server that unzips the directory during initialize and converts between http and file URIs in requests? The logic seems generic enough that it might be re-usable for different servers.

olafurpg commented 5 years ago

@felixfbecker Any thoughts on https://github.com/scalameta/metals/issues/402#issuecomment-454156685 ? I was gonna take a stab at this but was leaning towards implementing it outside the Metals repo because IIUC it doesn't need to be server-specific.

keegancsmith commented 5 years ago

We have something along those lines at https://github.com/sourcegraph/lsp-adapter

However, it needs to be updated to follow the scheme described above (uses a deprecated method for fetching remote files).

felixfbecker commented 5 years ago

We used the language-agnostic middleware in the past with lsp-adapter (see above), but moved away from it towards building support into language servers directly. One of the reasons for that is that more advanced logic like cross-repository go-to-definition or find-references often requires further hooking into the file fetching, and that there are other language-specific optimisations that can be done (only writing needed file types to disk, considering language-specific project boundaries, etc). Fetching an archive over HTTP and extracting it and mapping URIs actually doesn't take that much code, so it is easier to just implement it in the language server. And since URIs are part of the LSP spec, it makes sense to have first-class support for different schemes in language servers and not requiring to launch them in a wrapper process.

olafurpg commented 5 years ago

Thanks for the information. I understand why baking this logic directly into the language server is necessary to provide the best possible UX. However, it also increases the barrier of entry for language server developers to get up and running on Sourcegraph. I would love to support Metals on Sourcegraph but it's difficult for me to prioritize this amount of custom work when there are other important LSP features on the queue.

Would it be possible to go with the adapter approach to start with? Just to get off the ground, we can then later discuss how to improve the UX with nice features like cross-repo navigation and minimizing the number of downloaded files.

felixfbecker commented 5 years ago

Sure, that works too

olafurpg commented 5 years ago

Cool. I'm wrapping up a v0.4 release this week, I would love to get Metals working on Sourcegraph for the next v0.5 milestone 👍

felixfbecker commented 5 years ago

Awesome! Would this wrapper be distributed in an official Docker images? Or have to be run separate? Would be great if it didn't add complexity to how to run it but worked out of the box

olafurpg commented 5 years ago

I'm not sure yet how it will right, I think we can make it as easy to launch as the main server and keep it in the same JVM process.

olafurpg commented 4 years ago

Judging by https://about.sourcegraph.com/blog/code-intelligence-with-lsif it looks like this issue may no longer be of interest. Closing due to inactivity, feel free to reopen.