sbt / sbt-projectmatrix

Approaches to run subsets of the matrix #48

Closed by keynmol 3 years ago

keynmol commented 3 years ago

Hello!

I love the projectmatrix approach and use it for 100% of my cross-building needs. It also makes things much nicer in the IDE: Scala versions are represented as projects, so if you do something in a shared source that one Scala version doesn't like, you get that feedback.

The only issue I found is that when the matrix gets really big and involves Scala.js, running just test can break the JVM and needs a lot of memory. One can work around that by setting concurrency tags, but then all the output gets mixed together.
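For reference, a minimal sketch of the concurrency-tag workaround mentioned above. It assumes the goal is to run at most one test task at a time; the limit of 1 is an illustrative choice, not a value taken from this thread.

```scala
// build.sbt fragment.
// Assumption: allow only one task tagged as a test to run at any moment.
// This lowers peak memory use, but output from different projects can still interleave.
Global / concurrentRestrictions += Tags.limit(Tags.Test, 1)
```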

For example in weaver we have several axes:

  1. Cats Effect version
  2. Scala version
  3. Platform (js or jvm)

Most projects fill all three; some fill only two.

To solve the issue, we generate commands based on the matrix, e.g. test_CE2_JS_3_0_0_RC1, and invoke those on CI.

This works nicely on GitHub Actions, as we can parallelise the build and immediately see which change broke which combination.
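As a rough illustration of how such a command name could be derived from axis values, here is a standalone sketch in plain Scala. The object and method names (CommandNames, sanitize, commandName) are hypothetical and not part of sbt or projectmatrix; the only assumption is that every character other than a letter or digit is replaced with an underscore.

```scala
object CommandNames {
  // Replace every character that is not a letter or digit with an underscore,
  // so that version strings such as "3.0.0-RC1" become safe name segments.
  def sanitize(segment: String): String =
    segment.map(c => if (c.isLetterOrDigit) c else '_')

  // Join the task name and the sanitized segments with underscores.
  def commandName(task: String, segments: Seq[String]): String =
    (task +: segments.map(sanitize)).mkString("_")
}
```

With these definitions, CommandNames.commandName("test", Seq("CE2", "JS", "3.0.0-RC1")) yields test_CE2_JS_3_0_0_RC1, matching the example name above.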

The problem is that generating those commands is hard: you can't depend on ScopeFilter (when in ThisBuild/commands += ...) to get all the virtual axis values together with the project name, so I had to manually collect all the ProjectReference values and parse the project names.

Are there any recommended approaches to make this a bit more automatic? Is there something I can do in the projectmatrix source itself to aid me with at least getting the full

List[(LocalProject#name, Seq[VirtualAxis])]

in the commands interface?

I'm open to any solutions, really, and would love to contribute here if it requires internal support.

eed3si9n commented 3 years ago

Problem is - generating those commands is hard as you can't depend on ScopeFilter (when in ThisBuild/commands += ...) to get all the virtual axes values with the project name - so I had to manually collect all the ProjectReference and parse the project name.

There's already:

lazy val core12 = core.jvm("2.12.8")

lazy val appConfig12_212 = app.finder(config12, VirtualAxis.jvm)("2.12.8")

so maybe you want a variant of that like core.jvmAll and app.finderAll() that would return a list?

keynmol commented 3 years ago

Oh, I forgot about the finder, thanks.

Let me play with it locally and see what the approach would look like - finderAll would be useful as I can assemble this big list from all the project matrices, and then go over it with a pre-defined set of dimensions, collecting the required commands.

Seems like just adding this method to ProjectMatrixDef exposes enough information to achieve this:

    override def finderAll(): List[(Project, Seq[VirtualAxis])] =
      resolvedMappings.map { case (row, project) =>
        project -> row.axisValues
      }.toList

keynmol commented 3 years ago

Nice, so with the finderAll I was able to generalise:

  case class Dimension(
      matchName: PartialFunction[VirtualAxis, String],
      default: String
  )

  object Dimension {
    def create(default: String)(pf: PartialFunction[VirtualAxis, String]) =
      Dimension(pf, default)
  }

  // Builds one sbt command per distinct combination of dimension values.
  // Each command runs `cmd` in every matrix project that falls into that bucket.
  def crossCommand(
      cmd: String,
      alias: Option[String],
      matrix: sbt.internal.ProjectMatrix,
      dimensions: Dimension*
  ): Seq[Command] = {
    val allProjects = matrix.finderAll()

    // For each project, resolve one name segment per dimension,
    // falling back to the dimension's default when no axis matches.
    val buckets = allProjects.map {
      case (proj, axes) =>
        val found = dimensions.map(dim =>
          axes.collectFirst(dim.matchName).getOrElse(dim.default)
        )
        found -> proj
    }

    buckets
      .groupBy(_._1)
      .map {
        case (segments, results) =>
          val projects = results.map(_._2.id)
          val groupCmd = alias.getOrElse(cmd) + "-" + segments.mkString("-")

          groupCmd -> projects.map(id => id + "/" + cmd)
      }
      .toSeq
      .map {
        case (name, subcommands) =>
          Command.command(name) { state =>
            // Prepend the subcommands to the remaining commands, keeping their order.
            subcommands.foldRight(state)(_ :: _)
          }
      }
  }

  val scalaBinaryDimension: Dimension = Dimension.create("") {
    case v: VirtualAxis.ScalaVersionAxis if v.scalaVersion.startsWith("2.") =>
      v.scalaVersion.split('.').take(2).mkString("_")
    case v: VirtualAxis.ScalaVersionAxis => v.scalaVersion.replace('.', '_')
  }

  val platformDimension: Dimension = Dimension.create("jvm") {
    case v: VirtualAxis.PlatformAxis => v.value
  }
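To show what the two dimensions resolve to without needing sbt on the classpath, here is a standalone model of the collectFirst-with-default lookup. The Axis, ScalaAxis and PlatformAxis types are hypothetical stand-ins for sbt's VirtualAxis subtypes; only the matching logic is copied from the snippet above.

```scala
object DimensionModel {
  // Hypothetical stand-ins for sbt's VirtualAxis subtypes.
  sealed trait Axis
  final case class ScalaAxis(version: String) extends Axis
  final case class PlatformAxis(value: String) extends Axis

  final case class Dimension(matchName: PartialFunction[Axis, String], default: String)

  // Falls back to "jvm" when a project has no platform axis.
  val platform: Dimension = Dimension({ case PlatformAxis(v) => v }, "jvm")

  // Scala 2.x versions are shortened to major_minor; other versions keep all parts.
  val scalaBinary: Dimension = Dimension(
    {
      case ScalaAxis(v) if v.startsWith("2.") => v.split('.').take(2).mkString("_")
      case ScalaAxis(v)                        => v.replace('.', '_')
    },
    ""
  )

  // One segment per dimension: the first matching axis, or the dimension's default.
  def segments(axes: Seq[Axis], dims: Seq[Dimension]): Seq[String] =
    dims.map(d => axes.collectFirst(d.matchName).getOrElse(d.default))
}
```

In this model, a project with only a Scala 2.13.5 axis resolves to the segments 2_13 and jvm, while a Scala 3.0.0-RC1 project on the js platform resolves to 3_0_0-RC1 and js. Only dots are replaced, so the hyphen in the version survives.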

And in the build code one can just do

    commands ++= crossCommand(
      "test",
      None,
      core,
      scalaBinaryDimension,
      platformDimension
    ),
    commands ++= crossCommand(
      "scalafmt",
      Some("checkScalafmt"),
      core,
      scalaBinaryDimension,
      platformDimension
    )
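To make the resulting command names concrete, the grouping step can be modelled with plain strings. This is only a sketch: Entry, plan and the project ids below are hypothetical, and the real crossCommand works on sbt Project values rather than strings.

```scala
object CrossCommandModel {
  // Hypothetical stand-in for (Project, Seq[VirtualAxis]): a project id plus
  // the dimension segments that were already resolved for it.
  final case class Entry(projectId: String, segments: Seq[String])

  // Mirrors the groupBy step of crossCommand: one command name per distinct
  // segment combination, mapped to the "<projectId>/<cmd>" strings it would run.
  def plan(cmd: String, alias: Option[String], entries: Seq[Entry]): Map[String, Seq[String]] =
    entries
      .groupBy(_.segments)
      .map { case (segments, group) =>
        val name = alias.getOrElse(cmd) + "-" + segments.mkString("-")
        name -> group.map(e => e.projectId + "/" + cmd)
      }

  // Illustrative project ids only; real ids depend on the matrix definition.
  val entries: Seq[Entry] = Seq(
    Entry("core", Seq("2_13", "jvm")),
    Entry("coreJS", Seq("2_13", "js")),
    Entry("other", Seq("2_13", "jvm"))
  )
}
```

Here plan("test", None, entries) produces the names test-2_13-jvm and test-2_13-js, and the first of these maps to core/test followed by other/test. With the alias Some("checkScalafmt"), the names start with checkScalafmt instead of the task name.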

Barring some nastiness around the command names (dots aren't allowed?), would this functionality be useful in the plugin itself?

It's probably possible to just implement a command-safe name parameter on the virtual axes themselves and remove Dimension altogether.

keynmol commented 3 years ago

Once #50 is released, I will implement some of the ideas from here in a separate plugin (and use it in some matrix-heavy projects to verify the approach). Then we can revisit this or open a new issue to see whether it makes sense to have this functionality in projectmatrix itself.