sbt / sbt-assembly

Deploy über-JARs. Restart processes. (port of codahale/assembly-sbt)
MIT License

Proposal: Add withAssemblyUnzipDirectory to `AssemblyOption` #445

Open er1c opened 3 years ago

er1c commented 3 years ago

Description

Today there is a single setter (withAssemblyDirectory) to configure the working directory for the assembly process. It defaults to a directory within target/.

I propose adding one additional setter, withAssemblyUnzipDirectory, to configure where dependency jars are unzipped.

Motivation

When assembling an über artifact that has many library dependencies, the unzip process can be very IO-intensive (I've seen this step alone take 10+ minutes in our CI). The unzipped directories are well suited for CI systems to persist between job runs.

However, on a CI agent, the withCacheOutput cache is not always suitable to persist between job runs, especially if the agent is reused for different branches with lots of changes. From a practical standpoint, the target/ directory is a perfect location for that cache in this scenario.

Additionally, other CI workflows may want to persist the withCacheOutput cache between runs, or prune the contents of these directories differently from the withCacheUnzip directory cache.

Example Usages

Global Cache, Two Different Locations

  lazy val sbtAssemblyUnzipDirectory = taskKey[File]("Directory to cache sbt assembly library jars.")
  // Withdrawn: lazy val sbtAssemblyOutputDirectory = taskKey[File]("Directory to cache sbt assembly output.")
  lazy val sbtAssemblyDirectory = taskKey[File]("Directory to cache sbt assembly output.")

  assembly / assemblyOption := {
    val opt = (assembly / assemblyOption).value
    val dir = sbtAssemblyDirectory.value
    opt
      .withAssemblyUnzipDirectory(sbtAssemblyUnzipDirectory.value)
      .withCacheUnzip(true)  // Keep dependency .jar files pre-extracted
      // Withdrawn: .withAssemblyOutputDirectory(sbtAssemblyOutputDirectory.value)
      .withAssemblyDirectory(dir)
      .withCacheOutput(true) // Cache pre-merged assembly file .class files to diff changes
  }

Global Cache, One Location

  lazy val sbtAssemblyDirectory = taskKey[File]("Directory to cache sbt assembly output.")

  assembly / assemblyOption := {
    val opt = (assembly / assemblyOption).value
    val dir = sbtAssemblyDirectory.value
    opt
      .withAssemblyDirectory(dir)
      .withCacheUnzip(true)  // Keep dependency .jar files pre-extracted
      .withCacheOutput(true) // Cache pre-merged assembly file .class files to diff changes
  }

Global Unzip Cache, Local Output Cache

  lazy val sbtAssemblyUnzipDirectory = taskKey[File]("Directory to cache sbt assembly library jars.")

  assembly / assemblyOption := {
    val opt = (assembly / assemblyOption).value
    opt
      .withAssemblyUnzipDirectory(sbtAssemblyUnzipDirectory.value)
      .withCacheUnzip(true)  // Keep dependency .jar files pre-extracted
      .withCacheOutput(true) // Cache pre-merged assembly file .class files to diff changes
  }
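
The examples above declare the directory keys but leave their values to the build. A minimal build.sbt sketch of one way to define the unzip key, assuming the shared cache location is passed in through an environment variable: the variable name ASSEMBLY_UNZIP_CACHE and the fallback path under target/ are both made up for illustration.

```scala
// build.sbt fragment (sketch). The key name comes from the examples above;
// ASSEMBLY_UNZIP_CACHE and the fallback path are assumptions.
lazy val sbtAssemblyUnzipDirectory = taskKey[File]("Directory to cache sbt assembly library jars.")

sbtAssemblyUnzipDirectory := {
  // Project-local fallback, used when the CI agent does not provide a shared cache.
  val fallback = target.value / "assembly-unzip"
  sys.env.get("ASSEMBLY_UNZIP_CACHE").map(file).getOrElse(fallback)
}
```

With this, a CI agent that exports the variable gets a cache that survives between jobs, while local builds keep everything under target/.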

eed3si9n commented 3 years ago

I see the motivation for changing assemblyUnzipDirectory, so that makes sense. assemblyOutputDirectory on the other hand sounds like it's pretty much what we have already with withAssemblyDirectory?

er1c commented 3 years ago

> I see the motivation for changing assemblyUnzipDirectory, so that makes sense. assemblyOutputDirectory on the other hand sounds like it's pretty much what we have already with withAssemblyDirectory?

It probably is. Conceptually, I was thinking of withAssemblyDirectory as setting both withAssemblyUnzipDirectory and withAssemblyOutputDirectory to the same directory. But perhaps that can be an internal implementation detail, and the public API just gains the one additional setter, withAssemblyUnzipDirectory 🤔
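
A minimal sketch of that fallback idea in plain Scala, assuming the unzip directory is an optional override that defaults to the assembly directory. DirOptions is a hypothetical stand-in for AssemblyOption that models only these two fields; effectiveUnzipDirectory is an invented helper name, and the paths are placeholders.

```scala
import java.io.File

// Hypothetical stand-in for sbt-assembly's AssemblyOption; only the two
// directory fields discussed in this thread are modelled.
final case class DirOptions(
    assemblyDirectory: Option[File] = None,
    assemblyUnzipDirectory: Option[File] = None
) {
  def withAssemblyDirectory(dir: File): DirOptions =
    copy(assemblyDirectory = Some(dir))

  // The proposed setter (not an existing API at the time of this thread).
  def withAssemblyUnzipDirectory(dir: File): DirOptions =
    copy(assemblyUnzipDirectory = Some(dir))

  // Unzip location: the explicit override if set, else the assembly directory.
  def effectiveUnzipDirectory: Option[File] =
    assemblyUnzipDirectory.orElse(assemblyDirectory)
}

object DirOptionsDemo {
  def main(args: Array[String]): Unit = {
    val local  = new File("target/assembly")           // placeholder path
    val shared = new File("/var/cache/assembly-unzip") // placeholder path

    val oneLocation = DirOptions().withAssemblyDirectory(local)
    val split       = oneLocation.withAssemblyUnzipDirectory(shared)

    println(oneLocation.effectiveUnzipDirectory) // falls back to the assembly directory
    println(split.effectiveUnzipDirectory)       // uses the explicit override
  }
}
```

Under this model the "Global Cache, One Location" example needs no change, and only builds that want a separate unzip cache call the new setter.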

er1c commented 3 years ago

I'm also noodling on adding a separate task assemblyUnzip that could be used to bootstrap/pre-warm a CI agent, without doing the full assembly

er1c commented 3 years ago

> I'm also noodling on adding a separate task assemblyUnzip that could be used to bootstrap/pre-warm a CI agent, without doing the full assembly

This might be close enough to the existing assemblyPackageDependency or assemblyCacheUnzipDependency

er1c commented 3 years ago

@eed3si9n what do you think about this name? I was also thinking of assemblyCacheUnzipDependencies.