bkirwi / decline

A composable command-line parser for Scala.
http://monovore.com/decline/
Apache License 2.0

Spark shell started with assembly jar cannot resolve decline's cats dependency #437

Closed: zartstrom closed this issue 2 years ago

zartstrom commented 2 years ago

Hi there, I'm having trouble using decline from within a fat jar in Spark. I posted a question on StackOverflow and want to mention it here as well; I hope it's relevant to others. It may be that the issue is not decline-specific, sorry in advance.

This is my GitHub repo to reproduce the error.


I want to use decline to parse command-line parameters for a Spark application. I use sbt-assembly to create a fat jar and pass it to spark-submit. Unfortunately, I get the error java.lang.NoSuchMethodError: cats.kernel.Semigroup$.catsKernelMonoidForList()Lcats/kernel/Monoid; when the parameters are parsed (example below).

This is my code:

package example

import cats.implicits._
import com.monovore.decline._

object Minimal {

  case class Minimal(input: String, count: Int)

  val configOpts: Opts[Minimal] = (
    Opts.option[String]("input", "the input"),
    Opts.option[Int]("count", "the count")
  ).mapN(Minimal.apply)

  def parseMinimalConfig(
    args: Array[String]
  ): Either[Help, Minimal] = {
    val command = Command(name = "min-example", header = "my-header")(configOpts)
    command.parse(args)
  }
}

and this is my build.sbt:

name := "example"
version := "0.1"

scalaVersion := "2.12.10"
libraryDependencies ++= Seq("com.monovore" %% "decline" % "2.3.0")
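
The fat jar itself is built by the sbt-assembly plugin, enabled in project/plugins.sbt. A minimal sketch (the version number here is only illustrative):

// project/plugins.sbt: provides the `sbt assembly` task (version is illustrative)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.2.0")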

This is how I reproduce the error locally (Spark version 3.1.2):

~/playground/decline-test » ~/apache/spark-3.1.2-bin-hadoop3.2/bin/spark-shell --jars "target/scala-2.12/example-assembly-0.1.jar" 
22/08/31 14:36:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://airi:4040
Spark context available as 'sc' (master = local[*], app id = local-1661949407775).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.1.2
      /_/

Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_345)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import example.Minimal._
import example.Minimal._

scala> parseMinimalConfig(Array("x", "x"))
java.lang.NoSuchMethodError: cats.kernel.Semigroup$.catsKernelMonoidForList()Lcats/kernel/Monoid;
  at com.monovore.decline.Help$.optionList(Help.scala:74)
  at com.monovore.decline.Help$.detail(Help.scala:105)
  at com.monovore.decline.Help$.fromCommand(Help.scala:50)
  at com.monovore.decline.Parser.<init>(Parser.scala:21)
  at com.monovore.decline.Command.parse(opts.scala:20)
  at example.Minimal$.parseMinimalConfig(Minimal.scala:19)
  ... 49 elided

scala> :quit

Interestingly, adding the assembled jar to the plain Scala classpath does not yield the same error but prints the expected help message. My local Scala version is 2.12.16 and Spark's Scala version is 2.12.10, but I'm unsure whether that could be the cause.

~/playground/decline-test » scala -cp "target/scala-2.12/example-assembly-0.1.jar"                                                
Welcome to Scala 2.12.16-20220611-202836-281c3ee (OpenJDK 64-Bit Server VM, Java 1.8.0_345).
Type in expressions for evaluation. Or try :help.

scala> import example.Minimal._
import example.Minimal._

scala> parseMinimalConfig(Array("x", "x"))
res0: Either[com.monovore.decline.Help,example.Minimal.Minimal] =
Left(Unexpected argument: x

Usage: command --input <string> --count <integer>

our command

Options and flags:
    --input <string>
        the input
    --count <integer>
        the count)

scala>

I also tried Scala 2.13 with Spark 3.2.2 and got the same error, although I need to double-check that. What could I be missing?

bkirwi commented 2 years ago

So, if this is what I think it is: it's a pretty common issue with Spark, and not decline-specific. Briefly: Spark includes its own version of cats on the classpath, and that copy is being used in preference to whatever decline is pulling in. You could confirm this by checking which version of cats is included in the Spark distro, which version your build depends on, and whether the two are supposed to be binary compatible.
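
One way to check where the conflicting class is actually being loaded from is plain JVM reflection, run inside the same spark-shell (a sketch; getCodeSource can in principle return null, but for jar-loaded classes it points at the jar):

scala> cats.kernel.Semigroup.getClass.getProtectionDomain.getCodeSource.getLocation

If that prints a jar under Spark's jars/ directory rather than your assembly, the bundled cats-kernel is the one winning.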

Assuming this is your issue, a few ways to fix it:

  1. Setting spark.driver.userClassPathFirst=true in your Spark config (see the example after this list).
  2. Trying to find an old version of decline that happens to use a version of cats that's close enough to Spark's to dodge the issue.
  3. Shading cats with a rename rule, so the names don't conflict with whatever's running on the Spark cluster.

1 is the easy option, though in my experience most complex apps end up wanting 3 sooner or later!
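
For option 1, the setting can go straight on the command line you already use, e.g. (a sketch; note the userClassPathFirst settings are marked experimental in the Spark docs, and there's a matching spark.executor.userClassPathFirst for code running on executors):

~/apache/spark-3.1.2-bin-hadoop3.2/bin/spark-shell \
  --jars "target/scala-2.12/example-assembly-0.1.jar" \
  --conf spark.driver.userClassPathFirst=true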

zartstrom commented 2 years ago

Thanks for the hints @bkirwi; I went with option 3 and it works just fine:

assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("cats.**" -> "repackaged.cats.@1").inAll
)
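
The rename rule rewrites every cats.* package inside the assembled jar to repackaged.cats.* (the @1 substitutes whatever ** matched), so the shaded classes no longer collide with the copy of cats on Spark's classpath.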