b0o / SchemaStore.nvim

🛍 JSON schemas for Neovim
https://schemastore.org
Apache License 2.0

Consider adding an offline cache? #26

Open nogweii opened 4 months ago

nogweii commented 4 months ago

The schemastore.org website is offline today, returning 503 errors, and I experienced a couple of issues with it last week (timeouts).

It seems like clients should be more resilient and cache schemas offline. The YAML and JSON language servers should probably handle this themselves, but I think it would be better to have a central cache that Neovim manages, and your plugin seems like a natural fit for providing that API and managing the cache.

b0o commented 4 months ago

This plugin already maintains an offline "cache" of the catalog itself, although it does not cache individual schemas.

Currently, we use a GitHub action to poll the SchemaStore catalog on an hourly basis. If the catalog has changed, we re-generate the lua/schemastore/catalog.lua file, which is picked up by your Neovim installation the next time you update your plugins.
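For context, the generated catalog is a plain Lua table mirroring the fields of the upstream SchemaStore catalog JSON. A trimmed sketch of what lua/schemastore/catalog.lua roughly looks like (the entry shown is illustrative, not verbatim):

```lua
-- Sketch of the generated catalog format; each entry mirrors the
-- SchemaStore catalog JSON (name, description, fileMatch, url).
return {
  schemas = {
    {
      name = "package.json",
      description = "NPM configuration file",
      fileMatch = { "package.json" },
      url = "https://www.schemastore.org/package.json",
    },
    -- ...hundreds more entries...
  },
}
```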

Off the top of my head, caching individual schemas could work in one of two ways:

  1. In CI: Similar to how we generate catalog.lua, we could extend our GitHub action to copy all of the hundreds of schemas into the SchemaStore.nvim repo, and then use filesystem references instead of URLs to reference them from the catalog.
  2. At runtime: Locally cache schemas on the fly as they are used.
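As a rough illustration of approach 2, a runtime cache could map each schema URL to a file under Neovim's cache directory and rewrite the catalog URL to a file:// reference once downloaded. This is only a sketch; none of these function names exist in the plugin today, and it assumes curl is available:

```lua
-- Hypothetical runtime schema cache (approach 2); all names here are
-- assumptions, not part of the actual SchemaStore.nvim API.
local M = {}

local cache_dir = vim.fn.stdpath('cache') .. '/schemastore'

-- Map a schema URL to a stable path inside the cache directory.
local function cache_path(url)
  return cache_dir .. '/' .. vim.fn.sha256(url) .. '.json'
end

-- Return a file:// URL for a cached copy, downloading on first use.
function M.cached_url(url)
  local path = cache_path(url)
  if vim.fn.filereadable(path) == 0 then
    vim.fn.mkdir(cache_dir, 'p')
    -- If the download fails (e.g. offline and not yet cached -- the
    -- drawback of this approach), fall back to the remote URL.
    local ok = os.execute(('curl -fsSL -o %q %q'):format(path, url))
    if ok ~= 0 then
      return url
    end
  end
  return 'file://' .. path
end

return M
```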

Each approach has pros and cons:

Approach 1:

Approach 2:

Happy to hear any of your thoughts.

ElonH commented 3 months ago

Approach 2:

  • Cons:

    • If a schema isn't cached yet and the SchemaStore website goes down, this approach wouldn't help

What if we added an option to setup that lets users list their commonly used schema IDs, like Mason's ensure_installed?

SchemaStore.nvim could then download and cache those schemas during the setup stage.

In my case, I need to run Neovim on a fully offline machine. We can pack the Neovim data and config, move them over, and work on the offline machine.
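The idea above might look something like this in a user's config. The ensure_cached option is hypothetical and does not exist in SchemaStore.nvim today; the shape is modeled on Mason's ensure_installed:

```lua
-- Hypothetical setup option (not a real SchemaStore.nvim API):
-- pre-download and cache these schemas so they work fully offline.
require('schemastore').setup {
  ensure_cached = {
    'package.json',
    'github-workflow',
  },
}
```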

nogweii commented 3 months ago

Thinking about it some more, since we don't have the cooperation of the language servers, the only feasible method is to download some set of schemas ahead of time and present them to the LSPs.

I think, for the sake of avoiding a lot of unnecessary disk usage, the choice of which schemas to download should be left to the user. Maybe we can identify a few very common ones to serve as an initial seed? (The YAML LSP basically does that for an ancient version of Kubernetes.)