locationtech / geotrellis

GeoTrellis is a geographic data processing engine for high performance applications.
http://geotrellis.io

Unable to get attributeTable param when accessing a GeoTrellis layer stored in HBase via URI #3528

Closed RunBoo closed 7 months ago

RunBoo commented 7 months ago

Describe the bug

Unlike GeoTrellis layers stored in HDFS, S3, or local files, layers stored in HBase cannot be addressed as a catalog within the URI, as in these examples:

 *  @example "s3://bucket/catalog?layer=name&zoom=10"
 *  @example "hdfs://data-folder/catalog?layer=name&zoom=12&band_count=5"
 *  @example "gt+file:///tmp/catalog?layer=name&zoom=5"
 *  @example "/tmp/catalog?layer=name&zoom=5"

See https://github.com/locationtech/geotrellis/blob/master/store/src/main/scala/geotrellis/store/GeoTrellisPath.scala#L41-L44. There is no example for HBase.

To Reproduce

When retrieving GT layers from HBase using the URI gt+hbase://znkgtz01:2181?master=znkgtz01&attributes=attributes&layer=tiles&zoom=19&band_count=3, the HBaseAttributeStore fails to obtain the attributeTable parameter passed in the URI.

The HBaseAttributeStore is created here: https://github.com/geotrellis/geotrellis-server/blob/main/ogc/src/main/scala/geotrellis/server/ogc/OgcSource.scala#L140-L143

When the program reaches here:

def attributeStore(uri: URI): AttributeStore = {
  val instance = HBaseInstance(uri)
  val params = UriUtils.getParams(uri)
  val attributeTable = params.getOrElse("attributes", HBaseConfig.catalog)
  HBaseAttributeStore(instance, attributeTable)
}

https://github.com/locationtech/geotrellis/blob/master/hbase/src/main/scala/geotrellis/store/hbase/HBaseCollectionLayerProvider.scala#L38-L43 By then, the provided URI has been parsed so that only "hbase://znkgtz01:2181" is retained. As a result, the attributeTable parameter "attributes" cannot be obtained, and the default value "metadata" is always used.
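The effect described above can be reproduced with plain java.net.URI, without any GeoTrellis dependency. This is only a minimal sketch: `queryParams` is a hypothetical stand-in for UriUtils.getParams (its real implementation may differ), `stripQuery` merely simulates the URI being reduced to scheme, host, and port, and the default "metadata" is taken from the value reported in this issue.

```scala
import java.net.URI

object HBaseUriSketch {
  // Hypothetical stand-in for UriUtils.getParams: splits the query on '&' and '='.
  def queryParams(uri: URI): Map[String, String] =
    Option(uri.getQuery).toList
      .flatMap(_.split("&").toList)
      .map(_.split("=", 2))
      .collect { case Array(k, v) => k -> v }
      .toMap

  // Assumed default, based on the "metadata" value reported in this issue.
  val defaultCatalog = "metadata"

  // Mirrors the lookup in attributeStore: use "attributes" if present, else the default.
  def attributeTable(uri: URI): String =
    queryParams(uri).getOrElse("attributes", defaultCatalog)

  // Simulates the reported behaviour: only scheme, host, and port survive.
  def stripQuery(uri: URI): URI =
    new URI(uri.getScheme, null, uri.getHost, uri.getPort, null, null, null)
}
```

With the full URI from this report, `attributeTable` finds the "attributes" parameter. Once the same URI has passed through `stripQuery`, the lookup falls back to the default, which matches the symptom described.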

Expected behavior

The program should be able to read the attributeTable parameter from the URI and retrieve data from the corresponding HBase table: gt+hbase://znkgtz01:2181?master=znkgtz01&attributes=attributes&layer=tiles&zoom=19&band_count=3

Environment

Additional context

Is this a bug in GeoTrellis, or is there an issue with my URI format?

pomadchin commented 7 months ago

Hi @RunBoo, have you tried hbase://zookeeper[:port][?master=host][?attributes=table1[&layers=table2]]?

https://github.com/locationtech/geotrellis/blob/master/hbase-spark/src/main/scala/geotrellis/spark/store/hbase/HBaseSparkLayerProvider.scala#L28-L29

But I can check it for you. It is sad that we don't have good enough coverage for HBase.

pomadchin commented 7 months ago

You're totally right, GeoTrellisPath just does not support it!

RunBoo commented 7 months ago

@pomadchin Yes, I've tried the following, among others (where znkgtz01 is my IP address):

hbase://znkgtz01:2181?master=znkgtz01?attributes=attributes&layer=tiles&zoom=19&band_count=3
gt+hbase://znkgtz01:2181?master=znkgtz01?attributes=attributes&layer=tiles&zoom=19&band_count=3
gt+hbase://zookeeper:2181?master=znkgtz01?attributes=attributes&layer=tiles&zoom=19&band_count=3

The results are the same as in the original report:

https://github.com/locationtech/geotrellis/blob/master/hbase/src/main/scala/geotrellis/store/hbase/HBaseCollectionLayerProvider.scala#L38-L43 The provided URI is parsed so that only "hbase://znkgtz01:2181" is retained. As a result, the attributeTable parameter "attributes" cannot be obtained, and the default value "metadata" is always used.
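A side note on the URIs tried here: a second "?" does not start a second query section. In java.net.URI, everything after the first "?" (up to any "#") is the query. The sketch below assumes a plain split on "&" and "=", which is not necessarily how GeoTrellis parses parameters; `splitQuery` is a hypothetical helper used only for illustration.

```scala
import java.net.URI

object DoubleQuestionMarkSketch {
  // Hypothetical helper: a plain '&' / '=' split of the raw query string.
  def splitQuery(uri: URI): Map[String, String] =
    Option(uri.getQuery).toList
      .flatMap(_.split("&").toList)
      .map(_.split("=", 2))
      .collect { case Array(k, v) => k -> v }
      .toMap

  // The bracketed form read literally, with two '?' characters.
  val literal = new URI(
    "hbase://znkgtz01:2181?master=znkgtz01?attributes=attributes&layer=tiles")

  // The same parameters joined with '&' only.
  val joined = new URI(
    "hbase://znkgtz01:2181?master=znkgtz01&attributes=attributes&layer=tiles")
}
```

Under that assumption, `splitQuery(literal)` has no "attributes" key, because the second "?" and what follows it are absorbed into the value of "master". `splitQuery(joined)` yields "master", "attributes", and "layer" as separate keys.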

pomadchin commented 7 months ago

@RunBoo please take a look at https://github.com/locationtech/geotrellis/pull/3529. Could you verify whether it works for you?

RunBoo commented 7 months ago

Okay, thanks a lot. I'll try it.

@pomadchin The "zookeeper" here should be an IP address or DNS name.

RunBoo commented 7 months ago

@pomadchin It works for me. I'll close this issue. Thank you~