hotosm / oam-dynamic-tiler

Dynamic tiling of raster data for OpenAerialMap + others
Other
35 stars 12 forks source link

Support for imagery w/ units other than meters #9

Open mojodna opened 7 years ago

mojodna commented 7 years ago

This affects calculation of approximate zoom in get_zoom.py, which assumes meters. It previously used rasterio to transform the bounds and generate an affine transform, but this didn't work on large images.

It's probably best to transform to web mercator (EPSG:3857) if units are not meters and use those "meters" as the resolution.

See also hotosm/oam-uploader-api#42.

mojodna commented 7 years ago

Sample image: http://hotosm-oam.s3.amazonaws.com/uploads/2016-05-16/573a17d4cd0663bb003c32b6/scene/0/scene-0-image-0-0B5OqSBHBRDanMk42OGx6TE9BZlk

mojodna commented 7 years ago

17 covers geographic coordinates. Projections with units = feet still need to be tackled (it's in the WKT for a CRS, but needs to be parsed out).

mojodna commented 7 years ago

PROJ.4 strings implicitly use meters (+units=m) when projected. If +to_meter is included, it provides a conversion factor to, um, meters. This is usually how CRSes that use feet or kilometers are defined.

A quick survey of PROJ-related Python modules didn't reveal any that easily expose parameters.

mojodna commented 7 years ago

get_zoom.py now includes conversion from geographic coordinates to meters.

However, the metadata is generated w/ oin-meta-generator, which uses https://github.com/scisco/gdalinfo-json under the hood, so that's actually what needs to be adjusted to correctly reflect units in the metadata.

tombh commented 7 years ago

Hi Seth, I'm going to be taking a look at this. Let me just describe my current understanding:

There are 2 parts.

  1. Firstly the get_zoom.py query during transcoding needed to meet the expectation that all resolutions were in metres. This was important simply because without it then resulting imagery is distorted. This is fixed now for all CRSs using degrees or metres, but not feet (relatively a somewhat trivial problem at the moment). And this was addressed in your #17 fix. However, does this need to be applied retroactively to all existing transcoded imagery before the merged PR date?

  2. The gsd field in the metadata file (also generated during the process.sh run) still assumes that all values are in metres. This line in oin-meta-generator does not convert to metres depending on the source units: like I say it always assumes metres. Again I assume any fix needs to be retroactively applied to all existing imagery (or at least imagery not using metres)? Note that this point is a direct reference to hotosm/oam-uploader-api#42

smit1678 commented 7 years ago

@tombh Here is the matching line that the oin-meta-generator calls from https://github.com/scisco/gdalinfo-json.

Regarding retroactively applying, yes I think we can come back to that to fix the existing imagery. @mojodna may have an existing script that he's used in the past to do metadata updates.

tombh commented 7 years ago

If my assumptions in my previous comment are correct, then these are the potential solutions I propose:

  1. Very quick fix: before this line in the oin-meta-generator npm package shell out to something like gdalsrsinfo -o proj4 $FILENAME | grep -o 'units=[a-zA-Z]*' | cut -f2- -d=. Shelling out is generally something that should be avoided. But also this makes the assumption that the proj4 string will be consistently formatted.

  2. Update the npm gdalinfo-json package to also parse the PROJCS data. Although this is a more robust solution, formally supporting proj4's WKT is a significant undertaken, involving far more than just scraping the units value. What's more, proj4 parsing is already comprehensively supported by libgdal itself, so writing Yet Another Parserâ„¢ seems an unnecessary burden both in the building and maintenance.

  3. Swapping out gdalinfo-json for the defacto nodejs libgdal bindings: node-gdal. This is certainly the most comprehensive solution. However it would require a somewhat significant rewrite of oin-meta-generator - though I suspect such an effort would be roughly equivalent to the effort to support proj4 strings in gdalinfo-json. Except at the end oin-meta-generator would be integrated with the best possible, well in fact the official, ABI. Though I wonder why this route wasn't taken in the first place? Perhaps gdalinfo-json's author @scisco can shed some light on that?

mojodna commented 7 years ago

get_zoom.py is used during the prep process to figure out what the "approximate zoom" for a dataset should be set to. It's entirely distinct from anything that shows up in the OIN metadata and is only used within the tiling pipeline.

The gsd field (part of the OIN metadata) should always be in meters. However, due to gdalinfo-json's implementation, it will be in whatever units the data's coordinate reference system specifies.

Thus, the only thing that needs to be modified in the S3 bucket is the OIN metadata. I have scripts kicking around to do ad-hoc edits like that over everything in the bucket. Once oin-meta-generator does the right thing, let's work together on Slack to fix the affected metadata.

hotosm/oam-uploader-api#42 is actually the correct issue for this. From the tiler's perspective, this has been fixed (so I'm going to close it).

mojodna commented 7 years ago

(Sorry, just spotted your most recent comment)

1: ick, but sure, that seems like it would work. 2: http://proj4js.org/ may be helpful here. Let's stick with glue (even if it means shelling out); no need to write anything substantial. 3: gdalinfo-json uses node-gdal under the hood already, so it's more a matter of checking the CRS to see if it's geographic (degrees) or not.

mojodna commented 7 years ago

(reopening because the tiler doesn't correctly handle CRSes where units are feet)