kurtosis-tech / kurtosis

A platform for packaging and launching ephemeral backend stacks with a focus on approachability for the average developer.
https://docs.kurtosistech.com/
Apache License 2.0

Cache Starlark packages in the cluster, and clean old ones automatically, so Kurtosis is runnable offline #1308

Open mieubrisse opened 1 year ago

mieubrisse commented 1 year ago

Background & motivation

If I run kurtosis run github.com/kurtosis-tech/eth2-package while I'm offline, the APIC will try to clone github.com/kurtosis-tech/eth2-package and all its Starlark dependencies (this is the only behaviour we support right now). This will fail because the user is offline and the APIC can't reach GitHub.

Desired behaviour

Whenever Kurtosis clones a package once, it caches that package and its dependencies locally inside the Kurtosis cluster so that if the user goes offline in the future, they can still potentially run the Starlark package.

  1. The Kurtosis cluster has a cross-enclave Starlark package LRU cache, which gets a new entry whenever the APIC pulls a Starlark (SL) package. This is likely blocked on Starlark packages having proper dependency declarations, as well as hashes, so that we have a unique key to cache on.
  2. The cache is automatically emptied of old entries so it doesn't fill the user's disk without bound.
  3. The APIC resolves imports against the cache first
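To make the three points above concrete, here is a minimal in-memory sketch of the idea, not Kurtosis code. The names PackageCache, resolve, and the clone callback are hypothetical, and the cache key assumes each package version has a content hash, which the issue notes is not available yet. A real implementation would store package contents on disk in the cluster and bound the cache by size rather than by entry count.

```python
from collections import OrderedDict
from typing import Callable, Optional, Tuple


class PackageCache:
    """Hypothetical LRU cache keyed by (package locator, content hash)."""

    def __init__(self, max_entries: int) -> None:
        self.max_entries = max_entries
        # Least recently used entries sit at the front of the OrderedDict.
        self._entries = OrderedDict()

    def get(self, locator: str, digest: str) -> Optional[bytes]:
        key: Tuple[str, str] = (locator, digest)
        if key not in self._entries:
            return None
        # A hit marks the entry as most recently used.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, locator: str, digest: str, contents: bytes) -> None:
        key: Tuple[str, str] = (locator, digest)
        self._entries[key] = contents
        self._entries.move_to_end(key)
        # Point 2: evict the oldest entries once the cache is over its limit.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)


def resolve(cache: PackageCache, locator: str, digest: str,
            clone: Callable[[str], bytes]) -> bytes:
    """Point 3: consult the cache first; only clone on a miss."""
    cached = cache.get(locator, digest)
    if cached is not None:
        return cached
    contents = clone(locator)  # needs network access
    cache.put(locator, digest, contents)  # point 1: every pull is cached
    return contents
```

With this shape, a second resolve call for the same locator and hash never invokes clone, which is what would let a previously run package work offline.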

Thought: the logistics of sharing a volume across all the APICs, with multiple APICs potentially modifying it at the same time, sound like a gigantic nightmare. It really makes me think, yet again: "Should we just merge the APICs and the Engine?"
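For what it's worth, one common way to tame concurrent writers on a shared directory is to stage each package in a temporary directory and then rename it into place under its hash. The sketch below illustrates that general technique only; it is not something the issue proposes, and publish_package, cache_root, and the on-disk layout are all hypothetical. It assumes the cache lives on a single filesystem, that packages are non-empty, and that entries are keyed by content hash, so two writers racing on the same key produce identical contents.

```python
import os
import shutil
import tempfile


def publish_package(cache_root: str, digest: str, files: dict) -> str:
    """Write a package into cache_root/<digest> via a staged rename.

    `files` maps relative paths to bytes. Returns the final directory.
    """
    final_dir = os.path.join(cache_root, digest)
    if os.path.isdir(final_dir):
        return final_dir  # some writer already published this package

    # Stage inside cache_root so the rename stays on one filesystem.
    staging = tempfile.mkdtemp(prefix=".staging-", dir=cache_root)
    try:
        for rel_path, data in files.items():
            path = os.path.join(staging, rel_path)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as f:
                f.write(data)
        try:
            # Readers see either no entry or a complete one, never a partial one.
            os.rename(staging, final_dir)
        except OSError:
            # Lost the race: another writer published first. Keep theirs.
            if not os.path.isdir(final_dir):
                raise
    finally:
        # No-op after a successful rename; cleans up after a lost race or error.
        shutil.rmtree(staging, ignore_errors=True)
    return final_dir
```

This only covers adding entries. Eviction while another APIC is reading an entry is a separate problem, and is part of why a single owner of the cache (for example a merged Engine) may still be the simpler design.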

How important is this to you?

Painful; the lack of this feature makes using Kurtosis frictionful.

mieubrisse commented 1 year ago

This came from a side-by-side debugging session with Barnabas, who was trying to run the eth2-package offline.

leeederek commented 1 year ago

Hey @mieubrisse, is this painful because you currently cannot do local package imports? Would love to hear about the workflow a bit more. Thank you!

mieubrisse commented 1 year ago

@leeederek the problem is, if I run kurtosis run github.com/kurtosis-tech/eth2-package while I'm offline, the APIC will try to clone github.com/kurtosis-tech/eth2-package and all its Starlark dependencies (this is the only behaviour we support right now). This will fail because the user is offline and the APIC can't reach GitHub.

This ticket proposes that whenever Kurtosis clones a package once, it caches that package and its dependencies locally inside the Kurtosis cluster so that if the user goes offline in the future, they can still potentially run the Starlark package.

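This is a ubiquitous pattern with languages: Go caches modules under your ~/go directory (in ~/go/pkg/mod by default), npm installs packages into the node_modules directory and keeps a download cache in ~/.npm, Cargo caches Rust crates under ~/.cargo/registry, etc.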

leeederek commented 1 year ago

Maybe Q4 2023 / Q1 2024