[Utility] Decode image pixel geometry

cbeddow commented 3 years ago

Is your feature request related to a problem? Please describe.

When requesting an object detection, a field called geometry is returned. This contains the pixel coordinates of the detection. For example, it contains the outline of a building that was detected in the photo, as x,y coordinates in the image. It is encoded as a vector tile, so it needs to be decoded before it can be used.

Example: https://graph.mapillary.com/1933525276802129/detections?fields=id,geometry&access_token=MLY|XXX

Describe the solution you'd like

A function should use the decode() method of the mapbox-vector-tile library to convert the geometry field to JSON. This would be a utility that can be used in another function, something like get_detections(image_key).

Describe alternatives you've considered

This could also be a public method, but it makes more sense as a utility that can be reused when getting information about the detections for either an image key or a map feature.

Additional context

Detections are segmentations in the image where a class of object was detected, such as sidewalk, road, building, traffic sign, vegetation, or bench. Multiple detections are used to form point data, such as a bench with a longitude and latitude. In the API you can request graph.mapillary.com/IMAGE_ID/detections to get a list of everything detected in an image, or graph.mapillary.com/MAP_FEATURE_ID/detections to get a list of all detections that were determined to be the same object from multiple angles, then combined to predict a point location. Users often want the detection pixel coordinates in order to draw them on top of a JPG or in the MapillaryJS image viewer library, to highlight the portion of the image that matches an object displayed on a map.
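As a rough illustration of the two endpoints described above, here is a minimal sketch that only builds the request URL, following the pattern of the example link. The function name detections_url and its parameters are hypothetical and not part of any existing SDK; the same URL shape is assumed to work for both image IDs and map feature IDs.

```python
from urllib.parse import urlencode

# Base URL taken from the example request in this issue.
GRAPH_URL = "https://graph.mapillary.com"


def detections_url(object_id, access_token, fields=("id", "geometry")):
    """Build a detections URL for an image ID or a map feature ID.

    Hypothetical helper: it only formats the URL and performs no request.
    """
    query = urlencode(
        {"fields": ",".join(fields), "access_token": access_token},
        safe=",|",  # keep ',' and '|' readable, as in the example URL
    )
    return f"{GRAPH_URL}/{object_id}/detections?{query}"


if __name__ == "__main__":
    print(detections_url("1933525276802129", "MLY|XXX"))
```

The resulting string can be passed to any HTTP client; the token shown is the placeholder from the example, not a real credential.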

Rubix982 commented 3 years ago

This makes a lot of sense. Definitely, we need a utility similar to the decode() function here. Thanks, @cbeddow !

cbeddow commented 3 years ago

Here is a full code example for decoding:

import base64
import mapbox_vector_tile

base64_string = "Gjh4AgoGbXB5LW9yKIAgEikIARgDIiMJxCXQFHIMAAocACAHFAkIUQMLGQQvCAkcBQwTDAIGCggADw=="

data = base64.decodebytes(base64_string.encode('utf-8'))

detg = mapbox_vector_tile.decode(data)

print(detg)

# {'mpy-or': {'extent': 4096, 'version': 2, 'features': [{'geometry': {'type': 'Polygon', 'coordinates': [[[2402, 2776], [2408, 2776], [2413, 2762], [2413, 2746], [2409, 2736], [2404, 2732], [2363, 2734], [2357, 2747], [2359, 2771], [2363, 2776], [2377, 2779], [2383, 2789], [2389, 2788], [2392, 2783], [2396, 2783], [2402, 2776]]]}, 'properties': {}, 'id': 1, 'type': 3}]}}

I think we should have an option normalized=true that returns just the 'coordinates' attribute, without extent, version, features, etc. We should also normalize the coordinates, so the first pair [2402, 2776] would be normalized by dividing by the width and height of the original image (fields=height,width). Many users prefer this format, but anyone who wants the raw values could set normalized=false. I think this normalized format is required to be compatible with our Mapillary viewer library in JavaScript, but I need to confirm this.
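To make the proposal concrete, here is a minimal sketch of the two steps: pulling only the coordinates out of a decoded tile, and dividing them by a pair of divisors. The function names are hypothetical. The divisors are left as parameters on purpose, since it is an open question here whether they should be the image width and height, as suggested above, or the tile extent (4096 in the sample output).

```python
def extract_coordinates(decoded):
    """Collect the 'coordinates' of every feature in every layer.

    `decoded` is assumed to have the shape printed in the example above:
    {layer_name: {"extent": ..., "features": [{"geometry": {...}}, ...]}}
    """
    return [
        feature["geometry"]["coordinates"]
        for layer in decoded.values()
        for feature in layer["features"]
    ]


def normalize_rings(rings, x_divisor, y_divisor):
    """Divide every x by x_divisor and every y by y_divisor.

    `rings` is one polygon's coordinates: a list of rings of [x, y] pairs.
    """
    return [
        [[x / x_divisor, y / y_divisor] for x, y in ring]
        for ring in rings
    ]


if __name__ == "__main__":
    # Truncated version of the decoded output shown above (first three points).
    decoded = {
        "mpy-or": {
            "extent": 4096,
            "version": 2,
            "features": [
                {
                    "geometry": {
                        "type": "Polygon",
                        "coordinates": [[[2402, 2776], [2408, 2776], [2413, 2762]]],
                    },
                    "properties": {},
                    "id": 1,
                    "type": 3,
                }
            ],
        }
    }
    for rings in extract_coordinates(decoded):
        # Using the extent as the divisor is an assumption made for this demo.
        print(normalize_rings(rings, 4096, 4096))
```

Keeping extraction and normalization separate would let normalized=false simply skip the second step.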

By default we would then maybe run get_detections(image_key, normalized=true), and it would return a list of JSON objects where "geometry" : "Gjh4AgoGbXB5LW9yKIAgEikIARgDIiMJxCXQFHIMAAocACAHFAkIUQMLGQQvCAkcBQwTDAIGCggADw==" is replaced with "geometry" : {"coordinates" : [[[x,y], [x,y], [x,y]]]} in normalized values.
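A possible shape for that replacement step is sketched below. The name replace_geometry is hypothetical, and the tile decoder is passed in as an argument (in practice this would presumably be mapbox_vector_tile.decode) so the sketch has no third-party dependency. It assumes each detection carries a single feature and leaves normalization out for brevity.

```python
import base64


def replace_geometry(detections, decode_tile):
    """Return copies of `detections` with the base64 geometry decoded.

    `detections` is a list of dicts with a base64-encoded "geometry" string.
    `decode_tile` is any callable that turns raw tile bytes into a dict
    shaped like the decoded output shown earlier in this thread.
    """
    result = []
    for detection in detections:
        raw = base64.b64decode(detection["geometry"])
        decoded = decode_tile(raw)
        # Gather coordinates from every feature of every layer.
        coordinates = [
            feature["geometry"]["coordinates"]
            for layer in decoded.values()
            for feature in layer["features"]
        ]
        # Assumption: one feature per detection, so keep the first one.
        geometry = {"coordinates": coordinates[0] if coordinates else []}
        # Copy the detection so the caller's input is not modified.
        result.append({**detection, "geometry": geometry})
    return result
```

A call would then look like replace_geometry(api_response["data"], mapbox_vector_tile.decode), where the "data" key is an assumption about the response layout rather than something confirmed in this thread.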

cbeddow commented 3 years ago

Here is some extra context on "basic coordinates" which are normalized, compared to the raw coordinates we have currently: https://mapillary.github.io/mapillary-js/docs/theory/coordinates/#basic-image-coordinates

We should be aware that the Mapillary API may change soon to return basic coordinates also, in which case this module will not be necessary, but we can still build it now if it is not too much trouble, or leave it for a few weeks to see.

Rubix982 commented 3 years ago

It shouldn't be too big of a hassle, I think, to implement the normalization aspect.

mapillary / mapillary-python-sdk

[Utility] Decode image pixel geometry #32