julia-vscode / SymbolServer.jl

Cloud hosting #200

Open ZacLN opened 3 years ago

ZacLN commented 3 years ago

For now, a couple of questions:

davidanthoff commented 3 years ago

One simple way to start would be to use https://azure.microsoft.com/en-us/services/app-service/static/ for hosting, and I could manually run the initial indexing on my machine, which is pretty beefy. As a short-term solution, that might be the fastest way to test things out.

Once we have that figured out, we can think about more long-term solutions. One idea had been to host this on juliahub.com, I think?

davidanthoff commented 3 years ago

Ok, that was easy :) The content of https://github.com/julia-vscode/symbolcache is now just available at https://symbolcache.julia-vscode.org/, supposedly completely geo-replicated etc.

ZacLN commented 3 years ago

Fantastic! I'll pull together a script to cache the general registry.

Relevant to this: do you know of anywhere that ranks Julia packages by use? Ideally the ranking would be by the number of other packages that depend on a given package, but general popularity could serve as a proxy.
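If no existing ranking turns up, the dependency-count ordering could be computed directly. The following is a minimal Python sketch, assuming the direct dependencies of each package have already been extracted into a mapping; the function name `rank_by_dependents` and the toy data are hypothetical, and reading the mapping out of a real registry is not shown.

```python
from collections import Counter


def rank_by_dependents(deps):
    """Return (package, dependent_count) pairs, most depended-upon first.

    `deps` maps each package name to the packages it depends on directly.
    """
    counts = Counter()
    for package, requirements in deps.items():
        counts[package] += 0  # list packages even if nothing depends on them
        for required in set(requirements):  # ignore duplicate entries
            counts[required] += 1
    # Sort by descending count, then by name so ties have a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy data, not real registry contents.
example_deps = {
    "A": ["C"],
    "B": ["C", "D"],
    "C": [],
    "D": ["C"],
}
ranking = rank_by_dependents(example_deps)
```

This counts direct dependents only. Counting transitive dependents would need a graph traversal on top of the same mapping.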

pfitzseb commented 3 years ago

So I think we need a solution to two problems:

  1. Generating symbol caches for existing packages/versions.
  2. Generating caches for new packages/versions.

The first point is neatly solved by @davidanthoff's proposal, while the second one could be solved by generating the cache via GitHub Actions and uploading it either to the package release or a central repo (not sure how to do the latter securely though).

Then a client would check the central location for a cache first and otherwise try to download one from the package repo.

We could also think about hosting caches on the package servers, but a) bandwidth is a concern and b) that still doesn't solve the problem of where to generate them.

ZacLN commented 3 years ago

And so, from the client side, would it be as simple as looking to see whether you can download the relevant file from e.g. https://symbolcache.julia-vscode.org/Julia-1.7.0-DEV.204-x86_64--normal-0a9c92f99fc50f3d68fdfa2c6c129f92c83d4914?

ZacLN commented 3 years ago

Initial idea of how the client side may look: https://github.com/julia-vscode/SymbolServer.jl/pull/204

davidanthoff commented 3 years ago

For a more automatic thing, I thought we could somehow integrate with the bots that already run on the registry. For example, the registry bot that currently handles the TagBot issues could also open an issue on https://github.com/julia-vscode/symbolcache, which would then trigger a GitHub Action that indexes the new package and commits the result. I think that is all we would need. Hooking into the registry seems best to me, because in terms of timing the registry merge event is really when we would want to create a new cache item.

In terms of hosting, I agree that in an ideal world this would actually be part of the package server, but I think we can start out with something less integrated and just get a sense of what the traffic load etc. from all of this is.

@ZacLN Yep, I think on the client side the logic is literally just: a) check whether the cache file exists on disk (as we do already), b) if not, try to download it from a URL like the one you wrote down, c) if that doesn't work, try to generate the cache file locally as we already do.
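The three-step fallback above can be sketched roughly as follows. This is a minimal Python illustration rather than the SymbolServer.jl implementation: `ensure_cache`, `generate_locally` and the injectable `fetch` parameter are hypothetical names introduced here, and the download is stored as-is without any unpacking or validation.

```python
import urllib.request
from pathlib import Path


def _http_fetch(url, timeout=10):
    """Download `url` and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def ensure_cache(cache_path, url, generate_locally, fetch=None):
    """Make sure a cache file exists; return "disk", "download" or "local".

    `generate_locally` is a hypothetical callback that writes the cache file
    to the given path. `fetch` can be swapped out, e.g. for testing offline.
    """
    cache_path = Path(cache_path)
    # a) Use the file if it is already on disk.
    if cache_path.exists():
        return "disk"
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    # b) Otherwise try to download it. Network and HTTP errors raised by
    #    urllib are subclasses of OSError; a malformed URL raises ValueError.
    fetch = fetch or _http_fetch
    try:
        data = fetch(url)
    except (OSError, ValueError):
        data = None
    if data is not None:
        cache_path.write_bytes(data)
        return "download"
    # c) If the download failed, fall back to generating the cache locally.
    generate_locally(cache_path)
    return "local"
```

Returning the source of the cache makes it easy to log how often the download path is hit, which would help with estimating traffic.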

In terms of URLs, I was thinking maybe something like this for packages:

https://symbolcache.julia-vscode.org/v1/packages/PACKAGE_UUID/COMMIT_HASH.zip

Note that this no longer includes a) the Julia version or b) the binary platform.

For stdlib packages I think we could go with something like:

https://symbolcache.julia-vscode.org/v1/stdlib/PACKAGE_UUID/JULIA_VERSION.zip

And for base:

https://symbolcache.julia-vscode.org/v1/base/JULIA_VERSION.zip
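To make the three URL patterns concrete, here is a small Python sketch that builds them. The function names are hypothetical, the arguments in the usage lines are placeholders rather than real identifiers, and percent-encoding the path segments is an extra precaution assumed here, not something the proposal specifies.

```python
from urllib.parse import quote

BASE_URL = "https://symbolcache.julia-vscode.org/v1"


def _segment(value):
    """Percent-encode a single path segment (e.g. a '+' in a version string)."""
    return quote(str(value), safe="")


def package_cache_url(package_uuid, commit_hash):
    """URL for a registered package: no Julia version, no binary platform."""
    return f"{BASE_URL}/packages/{_segment(package_uuid)}/{_segment(commit_hash)}.zip"


def stdlib_cache_url(package_uuid, julia_version):
    """URL for a stdlib package, keyed by the Julia version."""
    return f"{BASE_URL}/stdlib/{_segment(package_uuid)}/{_segment(julia_version)}.zip"


def base_cache_url(julia_version):
    """URL for Base, keyed by the Julia version only."""
    return f"{BASE_URL}/base/{_segment(julia_version)}.zip"


# Placeholder arguments, mirroring the patterns above.
example_package = package_cache_url("PACKAGE_UUID", "COMMIT_HASH")
example_stdlib = stdlib_cache_url("PACKAGE_UUID", "JULIA_VERSION")
example_base = base_cache_url("JULIA_VERSION")
```

Keeping the `v1` prefix in one constant means a later change to the layout only touches a single line.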