Closed echeipesh closed 5 years ago
@echeipesh Note: I've not tried to run against Spark 2.4.x. So maybe there's something wrong there.
I wonder if some of this has to do with Scala logging using macros? Maybe an argument to get rid of it, and just use Java APIs?
If you're willing to try something, find any places where a package.scala
class inherits from LazyLogger
, and remove it, replacing call sites with something in the companion class or whatever... just to see if hiding it a bit more help.
@echeipesh Also, what version of Java is that running under?
I would say that figuring out how to shade this dependency would be enough. I don't have arguments to get rid of scala logging: It is more performant: inlines calls and makes calls to the underlying logger only in case the warning level allows it (sounds a bit naive but even these two things will reduce logging overhead).
@metasim This is all on Java 1.8
Based on your hunch removing the inheritance of LazyLogger
from rasterframes
package did the trick:
diff --git a/core/src/main/scala/org/locationtech/rasterframes/rasterframes.scala b/core/src/main/scala/org/locationtech/rasterframes/rasterframes.scala
index 1517e8f0..2b908c7f 100644
--- a/core/src/main/scala/org/locationtech/rasterframes/rasterframes.scala
+++ b/core/src/main/scala/org/locationtech/rasterframes/rasterframes.scala
@@ -21,7 +21,7 @@
package org.locationtech
import com.typesafe.config.ConfigFactory
-import com.typesafe.scalalogging.LazyLogging
+import com.typesafe.scalalogging.Logger
import geotrellis.raster.isData
import geotrellis.raster.{Tile, TileFeature}
import geotrellis.spark.{ContextRDD, Metadata, SpaceTimeKey, SpatialKey, TileLayerMetadata}
@@ -33,6 +33,7 @@ import org.locationtech.rasterframes.encoders.StandardEncoders
import org.locationtech.rasterframes.extensions.Implicits
import org.locationtech.rasterframes.model.TileDimensions
import org.locationtech.rasterframes.util.ZeroSevenCompatibilityKit
+import org.slf4j.LoggerFactory
import shapeless.tag.@@
import scala.reflect.runtime.universe._
@@ -43,8 +44,10 @@ package object rasterframes extends StandardColumns
with Implicits
with rasterframes.jts.Implicits
with StandardEncoders
- with DataFrameFunctions.Library
- with LazyLogging {
+ with DataFrameFunctions.Library {
+
+ @transient
+ protected lazy val logger: Logger = Logger(LoggerFactory.getLogger(getClass.getName))
@transient
private[rasterframes]
I have not tested it on DataBricks yet, but this produces an assembly where the imports of RasterFrame packages happens fine. I'll keep you posted.
Reproduced. I'll go ahead and fix this and any similar cases.
A bit further, but serialization error (despite transient
??):
$ spark-shell --jars /Users/sfitch/Coding/OSS/locationtech-rasterframes/pyrasterframes/target/scala-2.11/pyrasterframes-assembly-0.8.1-SNAPSHOT.jar
...
Spark context available as 'sc' (master = local[*], app id = local-1566321608265).
Spark session available as 'spark'.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.3.3
/_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_171)
Type in expressions to have them evaluated.
Type :help for more information.
scala> import org.locationtech.rasterframes._
import org.locationtech.rasterframes._
scala> import org.locationtech.rasterframes.datasource.raster._
import org.locationtech.rasterframes.datasource.raster._
scala> val rf = spark.read.raster.load("https://s22s-test-geotiffs.s3.amazonaws.com/luray_snp/B02.tif")
java.lang.AssertionError: assertion failed: unsafe symbol Logger (child of package scalalogging) in runtime reflection universe
at scala.reflect.internal.Symbols$Symbol.<init>(Symbols.scala:184)
at scala.reflect.internal.Symbols$TypeSymbol.<init>(Symbols.scala:3009)
at scala.reflect.internal.Symbols$ClassSymbol.<init>(Symbols.scala:3201)
at scala.reflect.internal.Symbols$StubClassSymbol.<init>(Symbols.scala:3496)
at scala.reflect.internal.Symbols$Symbol.newStubSymbol(Symbols.scala:498)
at scala.reflect.internal.pickling.UnPickler$Scan.readExtSymbol$1(UnPickler
...
Calling spark.withRasterFrames
triggers it too.
Caught the exception in the debugger after commenting out RF code triggering the AssertionError
, and maybe (?) anything inheriting from LazyLogging
has a problem? In the backtrace below, error occurred when Scala Reflection was attempting to resolve the type structure for Tile
.
Found this message on the stack in a frame further up, which makes me wonder if shading did't transform all references in all libraries:
missing or invalid dependency detected while loading class file 'geotrellis.raster.Tile'.
Could not access type LazyLogging in package com.typesafe.scalalogging,
because it (or its dependencies) are missing. Check your build definition for
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
A full rebuild may help if 'geotrellis.raster.Tile' was compiled against an incompatible version of com.typesafe.scalalogging.
Ran javap
directly:
$ javap -cp /Users/sfitch/Coding/OSS/locationtech-rasterframes/pyrasterframes/target/scala-2.11/pyrasterframes-assembly-0.8.1-SNAPSHOT.jar geotrellis.raster.Tile | grep scalalogging
public interface geotrellis.raster.Tile extends geotrellis.raster.CellGrid, geotrellis.raster.IterableTile, geotrellis.raster.MappableTile<geotrellis.raster.Tile>, rf.shaded.com.typesafe.scalalogging.LazyLogging {
and in the spark-shell
:
$ echo ':javap geotrellis.raster.Tile' | spark-shell --jars /Users/sfitch/Coding/OSS/locationtech-rasterframes/pyrasterframes/target/scala-2.11/pyrasterframes-assembly-0.8.1-SNAPSHOT.jar | grep scalalogging
public interface geotrellis.raster.Tile extends geotrellis.raster.CellGrid, geotrellis.raster.IterableTile, geotrellis.raster.MappableTile<geotrellis.raster.Tile>, rf.shaded.com.typesafe.scalalogging.LazyLogging
#3 = Utf8 Ljava/lang/Object;Lgeotrellis/raster/CellGrid;Lgeotrellis/raster/IterableTile;Lgeotrellis/raster/MappableTile<Lgeotrellis/raster/Tile;>;Lrf/shaded/com/typesafe/scalalogging/LazyLogging;
#12 = Utf8 rf/shaded/com/typesafe/scalalogging/LazyLogging
#13 = Class #12 // rf/shaded/com/typesafe/scalalogging/LazyLogging
Signature: #3 // Ljava/lang/Object;Lgeotrellis/raster/CellGrid;Lgeotrellis/raster/IterableTile;Lgeotrellis/raster/MappableTile<Lgeotrellis/raster/Tile;>;Lrf/shaded/com/typesafe/scalalogging/LazyLogging;
and both show all references as having been shaded. :-/
I think the problem might have something to do with Spark making heavy use of Scala TypeTag
s, which are compile-time generated constructs, likely reflecting the type paths as they were at compile time.
@echeipesh I'm worried that this problem isn't solvable without removing LazyLogging
from Tile
. My current theory is that you can't shade any type that is processed by the Spark SQL Encoder
subsystem due to its dependency on compile-time TypeTag
s. Since Tile
is core to RasterFrames, it's hard to see a workaround on this side of the fence, except perhaps some variant of this:
https://stackoverflow.com/a/40404371/296509
I'm not a fan because it means RasterFrames is essentially requiring an isolated version of GeoTrellis that has scalalogging shaded for it before its used by RasterFrames.
Branch with moving LazyLogging
around in RasterFrames.
https://github.com/s22s/rasterframes/commit/c07ccea238b900473eb471b7b97b99549e6ceca8
This issue was encountered while we were attempting to shade the
com.typesafe.scala-logging
classes in an effort to get RasterFrames running in DataBricks environment. DataBricks itself brings inscom.typesafe.scala-logging
version2.1.2
on the class path which is not binary compatible with version3.9.0
used by RasterFrames and will result inInstantiationException
Reproduce
The failure to shade can be reproduced locally and shows ups as a failed import.
The attempt is done against current
develop
by trying to hook into the mechanism inRFAssemblyPlugin
Then creating assembly using:
Inspecting the assembly jar we see the shading took effect:
Trying to use the assembly goes south:
Trying to see what is going on in
package org.locationtech.rasterframes.package
everything seems hunky-dory:Note that it implements the shaded version
rf.shaded.com.typesafe.scalalogging.LazyLogging