locationtech / geotrellis

GeoTrellis is a geographic data processing engine for high performance applications.
http://geotrellis.io

ZoomedLayoutScheme not compatible with LCC projection (EPSG:3978) #3118

Open jsmoreau opened 5 years ago

jsmoreau commented 5 years ago

https://github.com/locationtech/geotrellis/blob/9825fd220f24d941e7edca8a9db05a4616a8024f/spark/src/main/scala/geotrellis/spark/tiling/ZoomedLayoutScheme.scala#L26

The worldExtent is calculated incorrectly when using EPSG:3978. I think it has something to do with this class being called when a ZoomedLayoutScheme is created:

/**
  * This package is concerned with translation of coordinates or extents between
  * geographic extents and the grid space represented by SpatialKey(col, row) coordinates,
  * the layout that defines that grid space, as well as functionality for cutting tiles into
  * a uniform grid space.
  */
package object tiling {
  private final val WORLD_WSG84 = Extent(-180, -89.99999, 179.99999, 89.99999)

  implicit class CRSWorldExtent(crs: CRS) {
    def worldExtent: Extent =
      crs match {
        case LatLng =>
          WORLD_WSG84
        case WebMercator =>
          Extent(-20037508.342789244, -20037508.342789244, 20037508.342789244, 20037508.342789244)
        case Sinusoidal =>
          Extent(-2.0015109355797417E7, -1.0007554677898709E7, 2.0015109355797417E7, 1.0007554677898709E7)
        case _ =>
          WORLD_WSG84.reproject(LatLng, crs)
      }
  }
}

I created a copy that works by hardcoding the extent and also adjusting the EARTH_CIRCUMFERENCE value to match the width over which the LCC projection is applicable.

object CustomLCCZoomedLayoutScheme {
//  val EARTH_CIRCUMFERENCE = 2 * math.Pi * Haversine.EARTH_RADIUS
  val EARTH_CIRCUMFERENCE = 4087995d

  val DEFAULT_TILE_SIZE = 256
  val DEFAULT_RESOLUTION_THRESHOLD = 0.1

  def layoutColsForZoom(level: Int): Int = math.pow(2, level).toInt
  def layoutRowsForZoom(level: Int): Int = math.pow(2, level).toInt

  def apply(crs: CRS, tileSize: Int = DEFAULT_TILE_SIZE, resolutionThreshold: Double = DEFAULT_RESOLUTION_THRESHOLD) =
    new CustomLCCZoomedLayoutScheme(crs, tileSize, resolutionThreshold)

  def layoutForZoom(zoom: Int, layoutExtent: Extent, tileSize: Int = DEFAULT_TILE_SIZE): LayoutDefinition = {
    if(zoom < 0)
      sys.error("TMS Tiling scheme does not have levels below 0")
    LayoutDefinition(layoutExtent, TileLayout(layoutColsForZoom(zoom), layoutRowsForZoom(zoom), tileSize, tileSize))
  }
}

class CustomLCCZoomedLayoutScheme(val crs: CRS, val tileSize: Int, val resolutionThreshold: Double) extends LayoutScheme {
    import CustomLCCZoomedLayoutScheme.{EARTH_CIRCUMFERENCE, layoutColsForZoom, layoutRowsForZoom}

    /** This will calculate the closest zoom level based on the resolution in a UTM zone containing the point.
      * The calculated zoom level is up to some percentage (determined by the resolutionThreshold) less resolute than the cellSize.
      * If the cellSize is more resolute than that threshold's allowance, this will return the next zoom level up.
      */
    def zoom(x: Double, y: Double, cellSize: CellSize): Int = {
      val ll1 = Point(x + cellSize.width, y + cellSize.height).reproject(crs, LatLng)
      val ll2 = Point(x, y).reproject(crs, LatLng)
      // Try UTM zone, if not, use Haversine distance formula
      val dist: Double =
        if(UTM.inValidZone(ll1.y)) {
          val utmCrs = UTM.getZoneCrs(ll1.x, ll1.y)
          val (p1, p2) = (ll1.reproject(LatLng, utmCrs), ll2.reproject(LatLng, utmCrs))

          math.max(math.abs(p1.x - p2.x), math.abs(p1.y - p2.y))
        } else {
          Haversine(ll1.x, ll1.y, ll2.x, ll2.y)
        }
      val z = (math.log(EARTH_CIRCUMFERENCE / (dist * tileSize)) / math.log(2)).toInt
      val zRes = EARTH_CIRCUMFERENCE / (math.pow(2, z) * tileSize)
      val nextZRes = EARTH_CIRCUMFERENCE / (math.pow(2, z + 1) * tileSize)
      val delta = zRes - nextZRes
      val diff = zRes - dist

      val zoom =
        if(diff / delta > resolutionThreshold) {
          z + 1
        } else {
          z
        }

      zoom
    }

    def levelFor(extent: Extent, cellSize: CellSize): LayoutLevel = {
      val worldExtent = Extent(-3753241.79, 403745.10, 333754.20, 4290855.71)
      val l =
        zoom(extent.xmin, extent.ymin, cellSize)

      levelForZoom(worldExtent, l)
    }

    def levelForZoom(id: Int): LayoutLevel =
      levelForZoom(Extent(-3753241.79, 403745.10, 333754.20, 4290855.71), id)

    def levelForZoom(worldExtent: Extent, id: Int): LayoutLevel = {
      if(id < 0)
        sys.error("TMS Tiling scheme does not have levels below 0")
      LayoutLevel(id, LayoutDefinition(worldExtent, TileLayout(layoutColsForZoom(id), layoutRowsForZoom(id), tileSize, tileSize)))
    }

    def zoomOut(level: LayoutLevel) = {
      val layout = level.layout
      val newZoom = level.zoom - 1
      val newSize = math.pow(2, newZoom).toInt
      new LayoutLevel(
        zoom = newZoom,
        layout = LayoutDefinition(
          extent = layout.extent,
          tileLayout = TileLayout(
            newSize,
            newSize,
            layout.tileCols,
            layout.tileRows
          )
        )
      )
    }

    def zoomIn(level: LayoutLevel) = {
      val layout = level.layout
      val newZoom = level.zoom + 1
      val newSize = math.pow(2, newZoom).toInt
      new LayoutLevel(
        zoom = newZoom,
        layout = LayoutDefinition(
          extent = layout.extent,
          tileLayout = TileLayout(
            newSize,
            newSize,
            layout.tileCols,
            layout.tileRows
          )
        )
      )
    }
  }
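The core of the workaround is that the zoom/resolution arithmetic assumes the world extent is exactly EARTH_CIRCUMFERENCE map units wide, so the constant has to be replaced when the layout extent is not the Web Mercator world. A minimal, stdlib-only sketch of that relationship (the 4087995 width is the value hardcoded above; nothing here is GeoTrellis API):

```scala
// Resolution (map units per pixel) at a given zoom for a power-of-two
// tiling scheme: the world width is split into 2^z tiles of tileSize px.
val worldWidth = 4087995d   // width used in place of Earth's circumference
val tileSize   = 256

def resolutionAt(zoom: Int): Double =
  worldWidth / (math.pow(2, zoom) * tileSize)

// Zoom chosen for a target resolution: invert the formula with log base 2.
def zoomFor(resolution: Double): Int =
  (math.log(worldWidth / (resolution * tileSize)) / math.log(2)).toInt
```

If worldWidth stays at the Mercator circumference (~40075016 m) while the layout extent is the ~4088 km wide LCC extent, every zoom level is off by roughly log2(40075016 / 4087995) ≈ 3.3 levels, which is why both the extent and the constant must change together.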
pomadchin commented 5 years ago

Thank you so much for looking into it and creating this issue!

esmeetu commented 4 years ago

@jsmoreau hey, can you provide the image ingest code using your own ZoomedLayoutScheme? After I defined a custom ZoomedLayoutScheme like yours, changing crs.worldExtent, my Spark job falls into an infinite loop. My code:

    val inputRdd: RDD[(ProjectedExtent, MultibandTile)] =
      sc.hadoopMultibandGeoTiffRDD(inputPath).mapValues(m => m.withNoData(Option(0)))

    val (_, rasterMetaData) = CollectTileLayerMetadata.fromRDD(inputRdd, FloatingLayoutScheme(512))

    val tiled: RDD[(SpatialKey, MultibandTile)] =
      inputRdd.tileToLayout(rasterMetaData.cellType, rasterMetaData.layout, Bilinear).repartition(200)

    val layoutScheme = CustomZoomedLayoutScheme(CRS.fromEpsgCode(2438), tileSize = 256)

    val (zoom, reprojected): (Int, RDD[(SpatialKey, MultibandTile)] with Metadata[TileLayerMetadata[SpatialKey]]) =
      MultibandTileLayerRDD(tiled, rasterMetaData)
        .reproject(CRS.fromEpsgCode(2438), layoutScheme, Bilinear)
    // Create the attributes store that will tell us information about our catalog.
    val attributeStore = FileAttributeStore(outputPath)
    // Create the writer that we will use to store the tiles in the local catalog.
    val writer = FileLayerWriter(attributeStore)
    // Pyramiding up the zoom levels, write our tiles out to the local file system.
    Pyramid.upLevels(reprojected, layoutScheme, zoom, Bilinear) { (rdd, z) =>
      val layerId = LayerId("test", z)
      // If the layer exists already, delete it out before writing
      if(attributeStore.layerExists(layerId)) {
        new FileLayerManager(attributeStore).delete(layerId)
      }
      writer.write(layerId, rdd, ZCurveKeyIndexMethod)
    }

it prints out

DEBUG org.apache.spark.memory.TaskMemoryManager - Task 105 acquired 7.0 MB for org.apache.spark.util.collection.ExternalSorter@10f6bf18
DEBUG org.apache.spark.memory.TaskMemoryManager - Task 105 acquired 18.8 MB for org.apache.spark.util.collection.ExternalSorter@10f6bf18
DEBUG org.apache.spark.memory.TaskMemoryManager - Task 105 acquired 31.7 MB for org.apache.spark.util.collection.ExternalSorter@10f6bf18
DEBUG org.apache.spark.memory.TaskMemoryManager - Task 105 acquired 88.3 MB for org.apache.spark.util.collection.ExternalSorter@10f6bf18
pomadchin commented 4 years ago

Hey @esmeetu, are you sure that it is a loop? Can you print some details and post some Spark UI screenshots here (with tasks and executors)?

esmeetu commented 4 years ago

@pomadchin thanks for your quick reply! (screenshots attached: output, sparkjob, sparkui, sparkstage)

esmeetu commented 4 years ago

Spark UI page Executors tab is blank.

pomadchin commented 4 years ago

@esmeetu how do you submit it and what cluster do you have? Also, can you show a picture of the app after it has hung for an hour or so (just to see the state in which you think it hangs)?

esmeetu commented 4 years ago

@pomadchin I use Spark local mode.

    val conf =
      new SparkConf()
        .setMaster("local[*]")
        .setAppName("Spark Tiler")
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .set("spark.kryo.registrator", "geotrellis.spark.io.kryo.KryoRegistrator")

    val sc = new SparkContext(conf)

After one hour... (screenshot attached: sparkjob1)

pomadchin commented 4 years ago

@esmeetu gotcha, so it does not hang - it is just slow. Also you don't have any executors - it processes everything on a single executor. Try limiting each executor to a single core - that will let you achieve some parallelism.
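For reference, in local mode the Executors tab is mostly empty because everything runs inside the driver JVM, and parallelism comes from the number of local threads. A hedged sketch of pinning that down explicitly (the property names are standard Spark configuration, but the right values depend on your machine; treat this as an illustration, not a recommendation for specific numbers):

```scala
import org.apache.spark.SparkConf

// Run locally with an explicit thread count instead of local[*],
// so task parallelism is bounded and easier to reason about.
val conf = new SparkConf()
  .setMaster("local[4]")                 // 4 worker threads in the driver JVM
  .setAppName("Spark Tiler")
  .set("spark.default.parallelism", "8") // default parallelism for shuffles
```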

esmeetu commented 4 years ago

@pomadchin emm... I looked into the reproject source code. It looks strange: the newKey is outside the layout at zoom level 20.

(screenshot attached: debug)

pomadchin commented 4 years ago

@esmeetu ha, indeed; check the extents / CRS / etc. I would recommend creating a unit test and running the keysForGeometry function outside of Spark; it looks like it generates too many keys, and they are probably different from what you expect. But again, it is not hanging.
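One way to check this without Spark is to test the grid arithmetic directly. The following is a stdlib-only sketch of the SpatialKey math (the floor-division mirrors what GeoTrellis's layout machinery does conceptually; `Key`, `keyFor`, and `inBounds` are hypothetical helpers for illustration, not the library's API):

```scala
// Map a point to a (col, row) grid key for a layout that divides
// the world extent into layoutCols x layoutRows tiles. Points outside
// the extent produce keys outside [0, layoutCols) / [0, layoutRows),
// which is exactly the out-of-layout symptom described above.
final case class Key(col: Int, row: Int)

def keyFor(x: Double, y: Double,
           xmin: Double, ymin: Double, xmax: Double, ymax: Double,
           layoutCols: Int, layoutRows: Int): Key = {
  val tileW = (xmax - xmin) / layoutCols
  val tileH = (ymax - ymin) / layoutRows
  Key(
    math.floor((x - xmin) / tileW).toInt,
    math.floor((ymax - y) / tileH).toInt  // rows count down from the top
  )
}

def inBounds(k: Key, cols: Int, rows: Int): Boolean =
  k.col >= 0 && k.col < cols && k.row >= 0 && k.row < rows
```

Feeding in the world extent from this issue, Extent(-3753241.79, 403745.10, 333754.20, 4290855.71), at a low zoom and then a point you know lies inside your data makes it easy to see whether the keys land where expected.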

esmeetu commented 4 years ago

@pomadchin My input tif is about 100MB; it shouldn't be slow like this. How do I write a unit test for that function? I have no idea. In debugging, it generates 9064 SpatialKeys, and I think these SpatialKeys make no sense.

pomadchin commented 4 years ago

@esmeetu wait; what projection do you use? @jsmoreau defines the world extent as Extent(-3753241.79, 403745.10, 333754.20, 4290855.71) and yours is out of these bounds. Are you also in Canada Atlas Lambert?

esmeetu commented 4 years ago

I use EPSG:2438, and I calculate the worldExtent with Extent(77.45, 37.0, 88.0, 41.99).reproject(CRS.fromEpsgCode(4214), crs); crs in the code above is CRS.fromEpsgCode(2438).

esmeetu commented 4 years ago

The problem seems resolved. After I changed the EARTH_CIRCUMFERENCE value to my extent's length, it runs normally. Another question: why is the variable EARTH_CIRCUMFERENCE assigned an Integer type?

pomadchin commented 4 years ago

@ecgreb gz that you figured it out.

Answering your next question, EARTH_CIRCUMFERENCE is of type Double: https://github.com/locationtech/geotrellis/blob/master/layer/src/main/scala/geotrellis/layer/ZoomedLayoutScheme.scala#L26

Even though 2 here is an Integer, multiplying it by a Double promotes the entire expression to Double:

2 * math.Pi //> res0: Double = 6.283185307179586

Also, the author of this post suggests setting it this way:

val EARTH_CIRCUMFERENCE = 4087995d

That also sets EARTH_CIRCUMFERENCE to a Double value (note the lowercase d suffix).
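To make the typing concrete (plain Scala, nothing GeoTrellis-specific):

```scala
// Integer division truncates; mixing in a Double promotes the result.
val intDiv = 1 / 2                      // Int arithmetic: 0
val dblDiv = 1 / 2d                     // the d suffix makes 2 a Double: 0.5
val circ   = 2 * math.Pi * 6378137.0    // Int * Double promotes to Double
```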

esmeetu commented 4 years ago

@pomadchin Doesn't it lose some precision? Would keeping six or more decimals be better?

pomadchin commented 4 years ago

@esmeetu excuse me - I don't quite follow you at this point; what are you talking about? Is your question why @jsmoreau used a rounded value? If so, it is better to ask him. I think this precision just worked for his goals; I don't know how much the precision affects the results here (I can only guess not much, and it is not that important, since your goal is to generate some keys for the input).

esmeetu commented 4 years ago

@pomadchin yes, I think he might lose precision by using a rounded value when generating the pyramid. Thanks a lot! :100:

pomadchin commented 4 years ago

@esmeetu and thank you as well for diving into it 👍