The current caching capability significantly improves runtimes for remote schemas when there is a single remote file to download, but does nothing to improve the case where there are refs to resolve. Refs are cached in memory by `referencing`, but discarded between runs.
For faster runs, `check-jsonschema` should cache resolved refs on disk as well.
Some basic requirements:
- this must respect the `--no-cache` setting
- probably the same object which is used for fetching remote schemas should be passed to the ref resolver
- filenames must be chosen such that there are no conflicts between different schemas (users won't be able to control filenames)
- if the new file-and-dir layout for these data conflicts with the existing cache dir layout, that needs resolution
  - ideal: design a strategy to migrate cache data for the next 1-2 calendar years
  - acceptable: ignore old cache data, provide a changelog note on how to clean it up
- the behavior here needs to be tested
> [!NOTE]
> A friend of mine suggested putting cache data into a DB (e.g. sqlite) when we talked about this, so that it could be annotated with richer metadata and structure. Although that might be a good idea longer term, I don't want to reach for that quite yet -- I think this can be solved with a good dir structure for now.
Here's one initial idea, for evaluation:
- each `$ref` is canonically named `{md5 of the absolute URI}.json`
- in the `~/.cache/check_jsonschema/` dir, add a dir named `refs/` (the schemas are in a dir named `downloads/`, which now seems like a suboptimal name but will suffice)
- ref resolution stores resolved refs in the `refs/` dir
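
To make the idea above concrete for evaluation, here is a rough sketch of the naming and storage scheme. It is not taken from the check-jsonschema codebase: the function names (`ref_cache_path`, `store_ref`, `load_ref`) and the `disable_cache` parameter (standing in for `--no-cache`) are hypothetical, and it assumes resolved refs are JSON documents.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Optional

# Cache root as described above; the real location may be platform-dependent.
CACHE_ROOT = Path.home() / ".cache" / "check_jsonschema"


def ref_cache_path(uri: str, cache_root: Path = CACHE_ROOT) -> Path:
    """Map an absolute URI to refs/{md5 of the absolute URI}.json."""
    digest = hashlib.md5(uri.encode("utf-8")).hexdigest()
    return cache_root / "refs" / f"{digest}.json"


def store_ref(
    uri: str,
    document: Any,
    cache_root: Path = CACHE_ROOT,
    disable_cache: bool = False,  # hypothetical stand-in for --no-cache
) -> Optional[Path]:
    """Write a resolved ref to disk, unless caching is disabled."""
    if disable_cache:
        return None
    path = ref_cache_path(uri, cache_root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(document), encoding="utf-8")
    return path


def load_ref(
    uri: str,
    cache_root: Path = CACHE_ROOT,
    disable_cache: bool = False,  # hypothetical stand-in for --no-cache
) -> Optional[Any]:
    """Return the cached document, or None on a miss or when caching is disabled."""
    if disable_cache:
        return None
    path = ref_cache_path(uri, cache_root)
    if not path.is_file():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

Because the filename depends only on the absolute URI, two schemas that reference the same URI would share one cache file, while different URIs map to different names. Note that this sketch cannot distinguish a cache miss from a cached JSON `null`, and it does no expiry or validation of cached data.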
Original use-case sourced from this PR: #451