Open ZacLN opened 3 years ago
One simple way to start would be to use https://azure.microsoft.com/en-us/services/app-service/static/ for hosting, and I could manually run the initial indexing on my machine, which is pretty beefy. As a short-term solution, I think that might be the fastest way to let us test things out.
I think once we have that figured out we can think about more long-term solutions. One idea had been to host this on juliahub.com, I think?
Ok, that was easy :) The content of https://github.com/julia-vscode/symbolcache is now just available at https://symbolcache.julia-vscode.org/, supposedly completely geo-replicated etc.
Fantastic! I'll pull together a script to cache the general registry.
Relevant to this: do you know of anywhere that ranks Julia packages by use? Ideally this would be by the number of other packages that depend on a given package, but general popularity could serve as a proxy.
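If no such ranking exists, counting dependents ourselves should be cheap once the dependency lists are extracted from the registry. A rough sketch in Python, assuming a hypothetical mapping from package name to its direct dependencies (the function name and input format are made up for illustration, not taken from any existing tool):

```python
from collections import Counter
from typing import Dict, List, Tuple


def rank_by_dependents(deps: Dict[str, List[str]]) -> List[Tuple[str, int]]:
    """Rank packages by how many other packages list them as a direct dependency.

    `deps` is a hypothetical input: package name -> list of direct dependencies.
    """
    counts: Counter = Counter()
    for package, direct_deps in deps.items():
        # Use a set so a dependency listed twice is only counted once,
        # and ignore a package that lists itself.
        for dep in set(direct_deps):
            if dep != package:
                counts[dep] += 1
    # Most-depended-on first; ties broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up example data, not real registry contents.
example = {"A": ["B", "C"], "B": ["C"], "C": []}
ranking = rank_by_dependents(example)
```

This only counts direct dependents; packages nobody depends on are left out of the result, which seems fine if the goal is just to decide what to cache first.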
So I think we need a solution to two problems:
The first point is neatly solved by @davidanthoff's proposal, while the second one could be solved by generating the cache via GitHub Actions and uploading it either to the package release or a central repo (not sure how to do the latter securely though).
Then a client would check the central location for a cache first and otherwise try to download one from the package repo.
We could also think about hosting caches on the package servers, but a) bandwidth is a concern and b) that still doesn't solve the problem of where to generate them.
And so, from the client side, would it be as simple as looking to see whether you can download the relevant file from e.g. https://symbolcache.julia-vscode.org/Julia-1.7.0-DEV.204-x86_64--normal-0a9c92f99fc50f3d68fdfa2c6c129f92c83d4914?
Initial idea of how the client side may look: https://github.com/julia-vscode/SymbolServer.jl/pull/204
For something more automatic, I thought we could integrate with the bots that already run on the registry. For example, the registry bot that currently handles the TagBot issues could also open an issue on https://github.com/julia-vscode/symbolcache, which would then trigger a GitHub Action that indexes the new package and commits the result. I think that is all we would need. Hooking into the registry seems best because, in terms of timing, the registry merge event is really when we would want to create a new cache item.
In terms of hosting, I agree that in an ideal world this would actually be part of the package server, but I think we can start out with something less integrated and just get a sense of what the traffic load etc. from all of this is.
@ZacLN Yep, I think on the client side the logic is literally just: a) check whether the cache file exists on disk (as we do already), b) if not, try to download it from a URL like the one you wrote down, c) if that doesn't work, try to generate the cache file locally as we already do.
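To make the a/b/c order concrete, here is a minimal sketch in Python. It is not the SymbolServer.jl implementation; `resolve_cache`, `http_download`, and the `generate` callback are hypothetical names, and the downloader is passed in so the fallback order can be exercised without network access.

```python
import urllib.request
from pathlib import Path
from typing import Callable, Optional


def http_download(url: str, timeout: float = 10.0) -> Optional[bytes]:
    """Fetch `url`, returning None on any network or HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except OSError:  # URLError, HTTPError and timeouts all derive from OSError
        return None


def resolve_cache(
    cache_path: Path,
    url: str,
    download: Callable[[str], Optional[bytes]],
    generate: Callable[[], bytes],
) -> str:
    """Ensure `cache_path` exists; report which step supplied it."""
    # a) the cache file is already on disk
    if cache_path.exists():
        return "disk"
    # b) try the hosted cache
    data = download(url)
    source = "download"
    if data is None:
        # c) fall back to generating the cache locally
        data = generate()
        source = "generated"
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_bytes(data)
    return source
```

In real use `download` would be `http_download` and `generate` would wrap the existing local indexing step; a failed download is deliberately treated as a soft failure so the client never ends up worse off than it is today.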
In terms of URLs, I was thinking maybe something like this for packages:
https://symbolcache.julia-vscode.org/v1/packages/PACKAGE_UUID/COMMIT_HASH.zip
Note that this no longer includes a) the Julia version or b) the binary platform.
For stdlib packages I think we could go with something like:
https://symbolcache.julia-vscode.org/v1/stdlib/PACKAGE_UUID/JULIA_VERSION.zip
And for base:
https://symbolcache.julia-vscode.org/v1/base/JULIA_VERSION.zip
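As a sanity check on the three URL shapes, a small Python sketch of how a client might build them. The helper names are hypothetical, and percent-encoding the path segments is my own assumption rather than something settled in this thread; the example identifiers are placeholders, not real UUIDs or hashes.

```python
from urllib.parse import quote

BASE_URL = "https://symbolcache.julia-vscode.org/v1"


def _segment(value: str) -> str:
    # Percent-encode anything unusual (e.g. "+" in a version string)
    # so the value is safe to use as a single path segment.
    return quote(value, safe="")


def package_cache_url(package_uuid: str, commit_hash: str) -> str:
    """Cache URL for a registered package: keyed by UUID and commit hash."""
    return f"{BASE_URL}/packages/{_segment(package_uuid)}/{_segment(commit_hash)}.zip"


def stdlib_cache_url(package_uuid: str, julia_version: str) -> str:
    """Cache URL for a stdlib package: keyed by UUID and Julia version."""
    return f"{BASE_URL}/stdlib/{_segment(package_uuid)}/{_segment(julia_version)}.zip"


def base_cache_url(julia_version: str) -> str:
    """Cache URL for Base: keyed by Julia version only."""
    return f"{BASE_URL}/base/{_segment(julia_version)}.zip"


# Placeholder identifiers for illustration only.
pkg_url = package_cache_url("PACKAGE_UUID", "COMMIT_HASH")
base_url = base_cache_url("1.7.0-DEV.204")
```

Keeping the `v1` prefix in one constant means the layout can change later without touching the callers.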
For now, a couple of questions: