julia-actions / cache

A shortcut action to cache Julia artifacts, packages, and registries.
MIT License
38 stars 8 forks source link

enable registries by default #62

Closed IanButterworth closed 7 months ago

IanButterworth commented 1 year ago

Closes https://github.com/julia-actions/cache/issues/7

julia-buildpkg and julia-runtest now both use the pkgserver registry, so this should work

https://github.com/julia-actions/julia-buildpkg/pull/31 - landed properly on https://github.com/julia-actions/julia-buildpkg/releases/tag/v1.4.2

https://github.com/julia-actions/julia-runtest/pull/74 - landed properly on https://github.com/julia-actions/julia-runtest/releases/tag/v1.9.1

IanButterworth commented 10 months ago

@DilumAluthge can you review this please

DilumAluthge commented 10 months ago

julia-buildpkg and julia-runtest now both use the pkgserver registry, so this should work

Specifically, it's because they use the non-extracted tarball, right? IIRC, the problem with slowness on Windows was due to the time it takes to extract all the files?

DilumAluthge commented 10 months ago

and I think we can consider this a bug fix

To elaborate what I meant: if we can consider this to be a bugfix, then we don't have to make a breaking release of this action.

SaschaMann commented 9 months ago

+1 on considering this a bug fix. I don't have the capacity to benchmark it either but feel free to merge and tag a release if you deem it ready :)

IanButterworth commented 9 months ago

I've been thinking and I'm not sure we should be caching the registries by default.

Since 1.9, julia will only update the registry once per day, unless an explicit Pkg.update is called.

If we cache the registry, and now that scratchspaces are cached by default (where Pkg stores the date of the last registry update), Pkg will be able to know when the registry was last updated and avoid doing it unless update is called.

That means that if a user wants to pick up a new version of a package that was just registered, simply re-running CI won't do that when the packages are Pkg.add-ed or Pkg.instantiate-ed for 24 hrs.

They'd be forced instead to either clear out the cache, or add a Pkg.update

Also not caching the registry won't hurt the goal of ensuring precompile cache hits.

giordano commented 9 months ago

Since 1.9, julia will only update the registry once per day, unless an explicit Pkg.update is called.

The other actions could force a Pkg.Registry.update before installing the dependencies?

giordano commented 9 months ago

However I maintain that enabling caching of the git registry on Windows can be a performance hit (but the git registry on Windows is a problem anyway).

IanButterworth commented 9 months ago

The other actions could force a Pkg.Registry.update before installing the dependencies?

So the registry would be first downloaded from the cache, then an update downloaded. I think that defeats the point.

And yeah, I doubt we'd want to do this for git caches.

Perhaps we kick this ball further down the road.. The main gains to be had in duration are in getting precompile caches cached. Once that's stable we could see if this is worth it?

giordano commented 9 months ago

So the registry would be first downloaded from the cache, then an update downloaded. I think that defeats the point.

Not really: if you have a git registry, pulling a few commits is much quicker than doing a full clone, and if the compressed registry is already up-to-date nothing is downloaded at all, no?

omus commented 7 months ago

So Pkg.instantiate will automatically update the registry when no manifest file is present.

In the case where a manifest file is present then we require exact versions of packages so as long as the cached registry includes those required versions the instantiation should succeed. In a scenario where we're re-using a long lived cache and the manifest uses a version outside of the registry we end up falling back to updating the registries.

A quick example of julia-actions/julia-runtest updating the registries after using cache-registries: true with a successful cache hit:

Run julia-actions/julia-runtest@v1
Run echo "JULIA_PKG_SERVER_REGISTRY_PREFERENCE=${JULIA_PKG_SERVER_REGISTRY_PREFERENCE:-eager}" >> ${GITHUB_ENV}
Run # Functionality only currently works on a narrow range of Julia versions... see #76
Run # The Julia command that will be executed
    ...
    Updating registry at `~/.julia/registries/General`
    Updating git-repo `git@github.com:JuliaRegistries/General.git`

I think we're good to enable registry caching by default.

IanButterworth commented 7 months ago

Thanks. Based on that it does seem we can enable this by default.

There are cases where jobs do things like this

julia -e 'import Pkg; Pkg.add("JuliaFormatter"); format()`

i.e. use non-stdlib packages without a Project/Manifest

These cases will be affected by the registry update 24hrs cool-off, but in these cases the package are being used as tooling that is likely to be less critical to be up to date.

If that's the case the job can call Pkg.Registry.update() before add, or we could patch Pkg to avoid the cool-down period if on CI.