hotosm / oam-uploader-api

DEPRECATED - The OAM Uploader API server
BSD 3-Clause "New" or "Revised" License

Resolution is not calculated correctly #42

Closed: cgiovando closed this issue 3 years ago

cgiovando commented 7 years ago

All the imagery that was uploaded recently for Hurricane Matthew response shows RESOLUTION as 0.

Previous images, like these UAV ones in Tajikistan, show the resolution correctly.

smit1678 commented 7 years ago

To start digging into this deeper:

Recent DG imagery showing 0 gsd (metadata file here):

"gsd":0.0000044878248024438965

UAV imagery with correct gsd (metadata file here):

"gsd":0.16927

So this isn't a frontend issue, but something to do with how the gsd is calculated during metadata generation. Need to look further at the GeoTIFFs.

mojodna commented 7 years ago

It's a units thing. The DG imagery is in EPSG:4326, so that rather low number is in degrees rather than meters.
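(A quick back-of-the-envelope check of that explanation, assuming ~111,320 m per degree at the equator: 0.0000044878248 degrees × 111,320 m/degree ≈ 0.4996 m, i.e. right around the half-metre GSD you'd expect from DG satellite imagery. So the number itself is plausible; only its units are wrong.)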

mojodna commented 7 years ago

Short version of comments in hotosm/oam-dynamic-tiler#9: gdalinfo-json needs to check the units in the data's CRS and not just assume that they're meters. If they're not meters, they need to be converted to meters appropriately and slotted into the gsd field.
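A minimal sketch of what that check could look like with node-gdal (untested; assumes node-gdal's SpatialReference#isGeographic() and #getLinearUnits() bindings, a north-up GeoTIFF, and a crude equatorial metres-per-degree conversion):

const gdal = require("gdal");

// Return the ground sample distance in metres, whatever the CRS units.
function gsdInMetres(path) {
  const ds = gdal.open(path);
  const gt = ds.geoTransform; // [originX, pxWidth, rotX, originY, rotY, pxHeight]
  const pixelSize = Math.max(Math.abs(gt[1]), Math.abs(gt[5]));
  const srs = ds.srs;

  if (srs && srs.isGeographic()) {
    // Pixel size is in degrees: scale by metres-per-degree at the
    // image's centre latitude (~111,320 m per degree at the equator).
    const centreLat = gt[3] + (ds.rasterSize.y / 2) * gt[5];
    return pixelSize * 111320 * Math.cos(centreLat * Math.PI / 180);
  }

  // Projected CRS: getLinearUnits() reports the metres-per-unit factor.
  return srs ? pixelSize * srs.getLinearUnits().value : pixelSize;
}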

tombh commented 7 years ago

Carrying on the thread from hotosm/oam-dynamic-tiler#9

@mojodna said:

gdalinfo-json uses node-gdal

Oh! That is both clarifying and confusing. I just saw the exec() shelling out to gdalinfo /vsicurl/ and assumed that that was the approach taken by the whole module. What's confusing is that node-gdal binds directly to libgdal's C interface, so it has methods such as getLinearUnits() or, even more powerfully, getAttrValue() (available because libproj4 is a dependency of libgdal).

So the shelling and regexing in gdalinfo-json are curious, considering those native, robust, and simple libgdal methods are literally a few keystrokes away. I would assume that the author either is/was unaware of those methods or has a compelling justification for not using them.
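For concreteness, those calls look roughly like this (return values taken from the standard EPSG WKT definitions; exact shapes may vary by node-gdal version):

const gdal = require("gdal");

const geographic = gdal.SpatialReference.fromEPSG(4326);
geographic.getAttrValue("UNIT");    // "degree"
geographic.getAttrValue("UNIT", 1); // "0.0174532925199433" (radians per degree)

const projected = gdal.SpatialReference.fromEPSG(3857);
projected.getLinearUnits();         // { value: 1, units: "metre" }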

mojodna commented 7 years ago

A potential compelling justification is that "expensive" libgdal methods aren't async in node-gdal, so shelling out to gdal effectively achieves that.
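To illustrate the trade-off: shelling out like this keeps the event loop free while GDAL does potentially slow work (e.g. network-bound reads via /vsicurl/), whereas the equivalent synchronous node-gdal calls would block it. (Sketch only; the URL is a placeholder, though gdalinfo's -json flag is real.)

const { execFile } = require("child_process");

// Hypothetical remote image; /vsicurl/ tells GDAL to read it over HTTP.
const src = "/vsicurl/https://example.com/imagery/scene.tif";

execFile("gdalinfo", ["-json", src], (err, stdout) => {
  if (err) throw err;
  const info = JSON.parse(stdout);
  // The CRS (and hence its units) comes back as structured JSON, no regexing.
  console.log(info.coordinateSystem.wkt);
});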

tombh commented 7 years ago

Oh, I don't understand. Expensive as in they take a long time? I've never experienced that. And what's a specific example where asynchronously calling gdalinfo would be useful? Even then, I would have thought the responsibility for concurrency should be separated out somewhere else, namely into OS-level processes, like a worker.

mojodna commented 7 years ago

node-gdal hasn't yet implemented thread-pooling for heavy I/O operations, so it doesn't follow Node.js norms. (Not saying that it would be useful the way it's used now, just that that might have been the consideration behind shelling out via exec().)

tombh commented 7 years ago

Hmm, I understand what you're saying. But I just can't imagine that being the reason.

So right now, I don't see the use case for gdalinfo-json. It would be better to just depend on node-gdal in oin-meta-generator and get the relevant data using the libgdal interface. No need to prematurely optimise for concurrency.

mojodna commented 7 years ago

Totally, and it's not used in an interactive setting either (it's usually run as a command-line tool, where it's not competing with anything else).

tombh commented 7 years ago

Yeah agreed. Ok, hopefully I can get this fixed next week then. Thanks Seth.

tombh commented 7 years ago

I can't remember whether subscribers get notifications for related commits? Anyway, @mojodna, there's a new commit; it's rather significant. What are your thoughts?

mojodna commented 7 years ago

Not unless I'm explicitly mentioned in the commit. Thanks, I'll have a look!

tombh commented 7 years ago

Should we keep this issue open until the old imagery metadata has been updated, as per the original comment about the Hurricane Matthew imagery? That will require manually running scripts that incorporate the oin-meta-generator fix.

mojodna commented 7 years ago

From Slack:

#!/usr/bin/env bash

set -eo pipefail

# For every GeoTIFF in the bucket, regenerate its _meta.json sidecar
# using the fixed oin-meta-generator. (Assumes keys without spaces.)
aws s3 ls "s3://${1}/" --recursive | \
  grep '\.tif$' | \
  awk "{print \"s3://${1}/\" \$4}" | \
  while read -r src; do
    echo "$src"
    filename=$(basename "$src")
    aws s3 cp "$src" .

    # There may be no existing metadata; don't abort if this copy fails.
    set +e
    aws s3 cp "${src%.tif}_meta.json" orig.json
    set -e

    oin-meta-generator "$filename" > new.json

    if [[ -e orig.json ]]; then
      # Recursive merge: fields in new.json (e.g. the corrected gsd)
      # override the same fields in orig.json.
      jq -s '.[0] * .[1]' orig.json new.json > meta.json
      aws s3 cp meta.json "${src%.tif}_meta.json"
    else
      aws s3 cp new.json "${src%.tif}_meta.json"
    fi

    rm -f "$filename" orig.json new.json meta.json
  done

Usage: backfill-metadata.sh <bucket>

Should be run from us-east-1 because it needs to download files from S3 in order to generate metadata from them.

tombh commented 7 years ago

Thanks Seth. Just waiting to iron out oin-meta-generator issues...