Closed jpolchlo closed 6 years ago
@jpolchlo Thanks! I'll start looking at this today/tomorrow
As far as testing is concerned, we have been using locationtech-labs/geopyspark to produce TMS servers. We can work together to build a test notebook if you need.
@jpolchlo this is great and thank you for the contribution.
@jpolchlo I'm still trying to wrap my head around this a little. Do you have an example of the kind of object that TMSRasterData is supposed to wrap? Specifically, it seems like the primary motivation for the bind/unbind functionality is handling randomly assigned port values?
If you could give me a better sense of what goes on inside the unbind function that TMSRasterData calls, I think it would help me understand a little better. If I'm not mistaken, needing to call unbind() is what motivates the introduction of disgorge and the _server_state variable passing through kernel.py/layer.py?
Yes, disgorge and _server_state are there to support clean shutdown of the TMS servers, though not specifically to handle the random port assignment. Conceivably, the vis server could pass through a port assignment. It also seemed like a reasonable way to introduce state to the vis servers, in case that is useful in the future. I half expect that there's something I didn't consider in the system design that makes this dangerous.
The TMS object that I'm working with defines unbind() as in https://github.com/locationtech-labs/geopyspark/blob/master/geopyspark/geotrellis/tms.py#L176-L184, which calls out to https://github.com/locationtech-labs/geopyspark/blob/master/geopyspark-backend/geotrellis/src/main/scala/geopyspark/geotrellis/tms/Server.scala#L44-L48 (which in turn calls to the Scala code in that directory). It is entirely intended to shut down the server, free ports, and generally clean up after itself. In this case, there are background processes doing request aggregation (so that we make fewer calls out to Spark), and those processes need to be terminated.
This is a great idea; I hadn't noticed the option of creating a new reader type, but it's fairly clear that it's the right thing to do. It may also resolve some of the questions below regarding bind and the TMS API. If we were to use RasterData('http://localhost:<port>/.../{z}/{x}/{y}.png') (though the argument for a tms schema is readily apparent), then it removes the responsibility from the reader class to start up the TMS endpoint, as it obviously needs to exist before that call can be made. In this case, using GeoPySpark constructs, adding a TMS layer will take the following form:
tms = TMS.build(...)
tms.bind()
M.add_layer(RasterData(tms.url_pattern), display_option_1, display_option_2)
while removing it will just require M.remove_layer to be followed by tms.unbind().
The problem with such a solution is that TMS objects can easily be lost this way, and the servers continue to run on their ports, backed by data whose references can never be released. It's easy to suggest that the user needs to take ultimate responsibility, but in an interactive environment with rapid iteration, it should be expected that, unless it's rolled into the reader, these objects will be mishandled. That's the motivation for the ingest/disgorge pairing managing the bind/unbind process. So, for ease of use, and in the interest of reducing the user's bookkeeping burden, figuring out how to make RasterData(tms) into valid syntax is worthwhile. If we must rely on a string argument so that the schema can be extracted and routed through the entry point mechanism, then some significant ease of use will be lost. It already seems as if VRTs get some special treatment inside the Ktile vis server, so figuring out how special display objects can generalize would be worthwhile. I'd love to get your thoughts on this.
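To make the ingest/disgorge idea concrete, here is a rough sketch of a wrapper that ties a TMS server's lifetime to its layer. Everything in it is illustrative: FakeTMS is a stand-in exposing only the bind(), unbind(), and url_pattern members mentioned above, ManagedTMSLayer is a hypothetical name, the ingest/disgorge signatures are assumptions rather than the ones in this PR, and the URL is a placeholder.

```python
class FakeTMS:
    """Stand-in for a TMS object (hypothetical; not the GeoPySpark class)."""

    def __init__(self):
        self.bound = False
        self.url_pattern = None

    def bind(self):
        # A real server would acquire a port here; this URL is a placeholder.
        self.bound = True
        self.url_pattern = "http://localhost:8000/tile/{z}/{x}/{y}.png"

    def unbind(self):
        # A real server would stop background workers and free its port here.
        self.bound = False
        self.url_pattern = None


class ManagedTMSLayer:
    """Hypothetical wrapper: the layer, not the user, owns bind/unbind."""

    def __init__(self, tms):
        self._tms = tms

    def ingest(self):
        # Called when the layer is added: start the server, hand back its URL.
        self._tms.bind()
        return self._tms.url_pattern

    def disgorge(self):
        # Called when the layer is removed: always shut the server down.
        self._tms.unbind()


tms = FakeTMS()
layer = ManagedTMSLayer(tms)
url = layer.ingest()   # server is now bound
layer.disgorge()       # server is released even if the user forgot about tms
```

With this shape, losing the reference to tms in the notebook no longer leaks a running server, because removing the layer triggers the unbind.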
I think in either case, documentation of the reader interface would be of use.
And, yes, it seems that if I move over to using the appropriate reader structure, then we will remove the need for state to be carried, and that change can be reverted.
This PR makes it possible to display map layers that are furnished by a remote service with a TMS-like web interface. With this change, any data-handling back end that implements a simple protocol may display map data in the GeoNotebook web interface.
[Note: What we are calling "TMS" may not adhere strictly to the definition of a TMS server. We'd like to name this interface in a way that is consistent with the community's conventions, such that we don't create confusion further down the line. We'll take recommendations regarding the proper nomenclature to use.]
The interface for the TMS server assumes a standard power-of-two pyramid established over the global extent using a WebMercator projection. At zoom level n, there are 2^n partitions of the extent along each axis, indexed from 0 to 2^n-1, with the cartesian product of the two axes' partitions giving the space of (x, y) keys for which a tile may exist. [Note: At the moment, we index using the Google/Bing/OSM convention where (0, 0) corresponds to the extreme northwest tile. If necessary, one could accommodate a flag to indicate that the tiles are being served in accordance with the TMS standard of an origin in the extreme southwest.]
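As a rough illustration of the indexing described above, the sketch below computes the number of tiles per axis, converts a longitude/latitude pair to an (x, y) key under the Google/Bing/OSM convention using the standard Web Mercator tile formula, and flips a row index to the TMS-standard southwest origin. The function names are made up for this example and are not part of GeoNotebook or GeoPySpark.

```python
import math


def tiles_per_axis(zoom):
    """Number of partitions along each axis at the given zoom level."""
    return 2 ** zoom


def lonlat_to_xyz(lon, lat, zoom):
    """Tile key with (0, 0) at the extreme northwest (Google/Bing/OSM)."""
    n = tiles_per_axis(zoom)
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edges still map to a valid index.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def flip_row(y, zoom):
    """Convert a row index between northwest-origin and southwest-origin."""
    return tiles_per_axis(zoom) - 1 - y


x, y = lonlat_to_xyz(0.0, 0.0, 1)
tms_y = flip_row(y, 1)
```

The flip is its own inverse, so a single flag on the server would be enough to switch between the two conventions.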
To add a TMS-provided layer, one needs to provide an object that adheres to the following simple interface:
Objects that are duck-typed with this interface (provide these methods/properties) may be wrapped in a TMSRasterData (from geonotebook.wrappers) and passed to M.add_layer as per usual. The TMS server will be bound to a port at the time of creation; removing a TMS-backed layer will unbind the service. [Note: I have not implemented special handling of the standard styling options (e.g., opacity) and have no expectation that they will have any effect. A subsequent PR could address this.]