locationtech / geotrellis

GeoTrellis is a geographic data processing engine for high performance applications.
http://geotrellis.io

Bad performance with ZSpaceTimeKeyIndex #3293

Open laxiwuka opened 4 years ago

laxiwuka commented 4 years ago

Is your feature request related to a problem? Please describe.

Bad performance when computing indexRanges with geotrellis.store.index.zcurve.ZSpaceTimeKeyIndex

Describe the solution you'd like

A faster way to compute indexRanges

Describe alternatives you've considered

I tried using org.locationtech.sfcurve.zorder.Z3.zranges instead and it looks like performance is a lot better.

Example:

// took me 50 seconds
geotrellis.store.index.zcurve.Z3.zranges(17592186191908L, 17597546219268L)
// ran in 2.5 seconds
org.locationtech.sfcurve.zorder.Z3.zranges(Array(ZRange(17592186191908L, 17597546219268L)), maxRecurse = Some(11))

pomadchin commented 3 years ago

Another example (I didn't wait for the computation to finish, but the last timing I saw was 400 seconds):

// https://github.com/locationtech/geotrellis/issues/3293
// https://github.com/locationtech/geotrellis/issues/3293#issuecomment-841416743
// Should be a part of Z3Spec.scala
it("Z3 index wide ranges computation") {
  // start 2010-01-01
  // end 2022-12-31
  // at zoom level 12 (2^12)
  val keyBounds = KeyBounds(
    SpaceTimeKey(0, 0, 1262275200000L),
    SpaceTimeKey(8192, 8192, 1672416000000L)
  )

  val index = ZCurveKeyIndexMethod.byDay().createIndex(keyBounds)

  val res = index.indexRanges(
    SpaceTimeKey(0, 0, 1262275200000L),
    SpaceTimeKey(8192, 8192, 1672416000000L)
  )

  println(res)
}

The spec lives here
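
For a sense of scale, here is a rough, standalone sketch (not GeoTrellis code) of how many index cells the key bounds above cover. It assumes a day-resolution index buckets time as `instant / millisPerDay`; that is my assumption about what `byDay()` does, and `QueryBoxSize` and `dayBucket` are made-up names for illustration.

```scala
object QueryBoxSize {
  val millisPerDay: Long = 24L * 60 * 60 * 1000

  // Assumption: a day-resolution index maps an instant to its day bucket
  // by integer division. This is an illustration, not the library's code.
  def dayBucket(instant: Long): Long = instant / millisPerDay

  // Key bounds copied from the spec above.
  val (minCol, minRow, minInstant) = (0, 0, 1262275200000L)
  val (maxCol, maxRow, maxInstant) = (8192, 8192, 1672416000000L)

  // Both bounds are inclusive, hence the + 1 on every axis.
  val cols: Long = (maxCol - minCol + 1).toLong
  val rows: Long = (maxRow - minRow + 1).toLong
  val days: Long = dayBucket(maxInstant) - dayBucket(minInstant) + 1

  // Total number of (col, row, day) cells inside the query box.
  val cells: Long = cols * rows * days

  def main(args: Array[String]): Unit =
    println(s"cols=$cols rows=$rows days=$days cells=$cells")
}
```

Any range computation that walks the box down to individual cells has to cope with a box of this size, which would be consistent with the run times reported here.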

pomadchin commented 3 years ago

As a workaround for now, it is recommended to use the Hilbert curve index instead.

metasim commented 3 years ago

Long shot, but this issue made me think of https://github.com/locationtech/geotrellis/issues/2358 . I also have a vague memory of the time domain resolution having a big impact on memory usage, but I'm not 100% sure about that.

pomadchin commented 3 years ago

@metasim range generation gets stuck here, so maybe you're pointing in the right direction

metasim commented 3 years ago

@pomadchin Something tells me this was the killer op. That and the toSeq.... and the recursion...

I thought GT used SFCurve? Does this op do basically the same thing as this?

pomadchin commented 3 years ago

@metasim we haven't switched to sfcurve yet, so switching could resolve this issue, I guess; I think this is the PR that changed the Z ranges behavior: https://github.com/locationtech/sfcurve/pull/15
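
To illustrate why a recursion cap like the `maxRecurse` argument in the sfcurve call above can make such a difference, here is a minimal standalone sketch of Z-order range decomposition with a depth limit. It is neither the GeoTrellis nor the sfcurve implementation; `ZRangeSketch`, `Box`, `interleave` and this `zranges` are hypothetical names, and treating `maxRecurse` as a cap on subdivision depth is an assumption. The idea: split the cube into octants in Morton order, emit one range for an octant that lies fully inside the query box, skip disjoint octants, and once the depth limit is reached emit the whole octant even if it only partly overlaps.

```scala
import scala.collection.mutable.ArrayBuffer

object ZRangeSketch {
  // Inclusive query box in (x, y, z) grid coordinates.
  final case class Box(xmin: Int, ymin: Int, zmin: Int, xmax: Int, ymax: Int, zmax: Int)

  // Interleave the low `bits` bits of x, y, z into a Morton code
  // (x occupies the lowest bit of each 3-bit group).
  def interleave(x: Int, y: Int, z: Int, bits: Int): Long = {
    var result = 0L
    var i = 0
    while (i < bits) {
      result |= ((x >> i) & 1L) << (3 * i)
      result |= ((y >> i) & 1L) << (3 * i + 1)
      result |= ((z >> i) & 1L) << (3 * i + 2)
      i += 1
    }
    result
  }

  // Sorted, merged Morton ranges covering `query` inside a cube of side 2^bits.
  // With a small maxRecurse the ranges over-cover the box but are cheap to compute.
  def zranges(query: Box, bits: Int, maxRecurse: Int): Seq[(Long, Long)] = {
    val out = ArrayBuffer.empty[(Long, Long)]

    // Append a range, merging it with the previous one when they are adjacent.
    def emit(lo: Long, hi: Long): Unit =
      if (out.nonEmpty && out.last._2 + 1 == lo) out(out.length - 1) = (out.last._1, hi)
      else out += ((lo, hi))

    // Visit the aligned cube with origin (cx, cy, cz) and side 2^k.
    def visit(cx: Int, cy: Int, cz: Int, k: Int, depth: Int): Unit = {
      val size = 1 << k
      val (ex, ey, ez) = (cx + size - 1, cy + size - 1, cz + size - 1)
      val disjoint =
        ex < query.xmin || cx > query.xmax ||
        ey < query.ymin || cy > query.ymax ||
        ez < query.zmin || cz > query.zmax
      if (!disjoint) {
        val contained =
          cx >= query.xmin && ex <= query.xmax &&
          cy >= query.ymin && ey <= query.ymax &&
          cz >= query.zmin && ez <= query.zmax
        // An aligned cube of side 2^k maps to one contiguous Morton range.
        val lo = interleave(cx, cy, cz, bits)
        val hi = lo + (1L << (3 * k)) - 1
        if (contained || k == 0 || depth >= maxRecurse) emit(lo, hi)
        else {
          val half = size / 2
          // Children in Morton order: x varies fastest, then y, then z.
          for (dz <- 0 to 1; dy <- 0 to 1; dx <- 0 to 1)
            visit(cx + dx * half, cy + dy * half, cz + dz * half, k - 1, depth + 1)
        }
      }
    }

    visit(0, 0, 0, bits, 0)
    out.toSeq
  }
}
```

In a 4x4x4 cube (`bits = 2`), `zranges(Box(1, 0, 0, 2, 0, 0), 2, 10)` yields the two exact ranges `(1, 1)` and `(8, 8)`, while `maxRecurse = 1` stops one level down and yields the single coarser range `(0, 15)`. The caller then filters out the extra keys when reading, trading some over-scan for far less recursion.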