mapillary / mapillary-python-sdk

A Python 3 library built on the Mapillary API v4 to facilitate retrieving and working with Mapillary data.
MIT License

[User Story] 2. Images & Traffic Signs #93

Open Rubix982 opened 3 years ago

Rubix982 commented 3 years ago

Context: A user who captures images all over the city with a camera would like to:

  1. Get a list of images uploaded to their organization (request to the image coverage tiles, then filter by the organization)
  2. Filter for images during a certain date range, like last week (so apply a date filter).
  3. Then they want to get all traffic signs visible in these images (so must really then call the same tiles from the traffic signs layer, using the same z/x/y)
  4. Then filter by calling graph.mapillary.com/TRAFFIC_SIGN_ID?fields=images, and keep only the traffic signs where at least 1 image in the images list matches a universal list of images from the tile request that uses date and organization filter
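Step 3 depends on both layers being requested with the same z/x/y tile. As a rough sketch, the standard Web Mercator (slippy map) formula below converts a longitude/latitude into tile indices. The helper name `lnglat_to_tile` and the default zoom of 14 are assumptions made for illustration; neither comes from the SDK.

```python
import math
from typing import Tuple


def lnglat_to_tile(lng: float, lat: float, zoom: int = 14) -> Tuple[int, int, int]:
    """Return (z, x, y) of the Web Mercator tile containing a point.

    Hypothetical helper; the zoom default of 14 is an assumption.
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = int((lng + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return zoom, x, y
```

The same (z, x, y) triple could then be used for both the image coverage request and the traffic signs request.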

Concern: Each of these red boxes is 1 degree of longitude by 1 degree of latitude. Maybe the max can be 2x2 degrees. The user could maybe set an override, but by default we reject a bbox if its area exceeds 2x2 = 4 square degrees.

For a radius, simply take area = (2 * radius)^2, and also reject if area > 4 square degrees. The radius may be given in meters, though, so we need a handler for that too: the user should input the radius in meters, which we probably need to convert to degrees.
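A minimal sketch of the radius check, assuming one degree is roughly 111,320 m (the approximate length of a degree of latitude; a degree of longitude is shorter away from the equator, which this ignores). The function names and the constant are hypothetical and only illustrate the idea.

```python
METERS_PER_DEGREE = 111_320.0  # assumption: approximate length of one degree of latitude
MAX_AREA_SQ_DEG = 4.0          # the 2 x 2 degree default limit suggested above


def radius_to_degrees(radius_m: float) -> float:
    """Convert a radius in meters to an approximate radius in degrees."""
    if radius_m < 0:
        raise ValueError("radius must be non-negative")
    return radius_m / METERS_PER_DEGREE


def radius_exceeds_limit(radius_m: float, max_area: float = MAX_AREA_SQ_DEG) -> bool:
    """Apply area = (2 * radius)^2 in degrees and compare it with the limit."""
    side = 2 * radius_to_degrees(radius_m)
    return side * side > max_area
```

With these assumptions, a 200 m radius passes easily, while a 25,000 km radius is rejected.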

[Image: red boxes]

Approaches/Limitations: Interestingly, this is not a function that gets data by a bounding box, nor by an input coordinate with a radius search. It has no geographic input argument, only an organization name/key and a date range. So a get_data function may need some special design.

It must have some argument that limits the search,

  1. The bounding box (with max size maybe, like a bounding box for all of Africa may be too big for the API or server)
  2. A radius around a coordinate (with a max radius, like 200 meters is okay but 25,000 km is too much), or a date range (is there one that is too big?)
  3. An organization key. For example, if I just call get_data(<date range from 2014 to 2021>), it would try to pull all global data from the entire time Mapillary has existed. That is too big; the API would probably return a 500, or deny access with a rate limit.
  4. We also need to say that you need either a bbox, a radius, an organization key, or a date range, or a combination, but you cannot have none. So all these arguments are optional as long as at least 1 is given, but you cannot call a get_data function with no arguments. So we need to think about how to check that.
  5. The other limitation then is just on bounding box, radius, or date range size maybe. For example,
    1. If bbox is within the size limit, then the date range can be anything
    2. If bbox and date range both exceed the size limit (box size of Africa, date of 3 years), reject
    3. If bbox is the size of Africa, but the date range is 1 week, accept
    4. Same with radius. No limits on the organization as a parameter.
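The rules above could be checked with something like the sketch below. Everything here is hypothetical: the name `validate_query`, the 4 square degree area limit, the 365-day date limit (the discussion does not fix one), and the meters-per-degree constant. A missing bbox, radius, or date range is treated as "not too big", which is a simplification.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

MAX_AREA_SQ_DEG = 4.0                 # assumed default area limit
MAX_DATE_RANGE = timedelta(days=365)  # hypothetical; no value is fixed in the issue
METERS_PER_DEGREE = 111_320.0         # rough approximation

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y) in degrees


def validate_query(
    bbox: Optional[BBox] = None,
    radius_m: Optional[float] = None,
    org_key: Optional[str] = None,
    date_range: Optional[Tuple[datetime, datetime]] = None,
) -> None:
    """Raise ValueError if the query has no limiting argument or is too large."""
    if bbox is None and radius_m is None and org_key is None and date_range is None:
        raise ValueError("provide at least one of bbox, radius_m, org_key, date_range")

    area = 0.0
    if bbox is not None:
        min_x, min_y, max_x, max_y = bbox
        area = abs(max_x - min_x) * abs(max_y - min_y)
    elif radius_m is not None:
        area = (2 * radius_m / METERS_PER_DEGREE) ** 2
    area_too_big = area > MAX_AREA_SQ_DEG

    dates_too_big = False
    if date_range is not None:
        start, end = date_range
        if end < start:
            raise ValueError("date_range end is before start")
        dates_too_big = (end - start) > MAX_DATE_RANGE

    # Reject only when both exceed their limits; no limit applies to org_key.
    if area_too_big and dates_too_big:
        raise ValueError("search area and date range are both too large")
```

Under these assumed limits, a bbox the size of Africa with a one-week date range is accepted, while the same bbox with a three-year range is rejected.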

I will ask for some tips on bbox and radius size limits. For bbox, maybe an area limit, so abs(max_x - min_x) * abs(max_y - min_y) must be < limit. Then x,y of bbox are in degrees though, not meters. But that's okay. Maybe 1x1 degree is the limit, so 1 square degree.
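The area check described here is small enough to sketch directly. The function names are hypothetical, and the 1 square degree default simply mirrors the "maybe" limit mentioned above.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y) in degrees


def bbox_area_sq_deg(bbox: BBox) -> float:
    """Area of a bbox in square degrees: abs(max_x - min_x) * abs(max_y - min_y)."""
    min_x, min_y, max_x, max_y = bbox
    return abs(max_x - min_x) * abs(max_y - min_y)


def bbox_within_limit(bbox: BBox, limit: float = 1.0) -> bool:
    """True if the bbox area is strictly below the limit (assumed 1 square degree)."""
    return bbox_area_sq_deg(bbox) < limit
```

For instance, a bbox of (12.0, 55.0, 12.5, 55.5) covers 0.25 square degrees and would be accepted, while a full 1x1 degree box would not, because the comparison is strict.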

Code Example [OPTIONAL]: Something like,

get_data(layer="traffic_signs", filter={"organization": "XXX", "min_date": "YYYY-MM-DD HH:MM:SS", "max_date": "YYYY-MM-DD HH:MM:SS"})

Inside this is a function that gets a list of all image keys that belong to the organization and fall within the date range, like valid_images = [key1, key2, key3, ...keyN], along with the tile coordinates, like tiles = [[x, y], [x, y], [x, y]].

Say there are 3 tiles (maybe just 1, maybe 20?). Then get a list of all traffic sign IDs in the same tile coordinates, and make an API request for each traffic sign ID (that's a lot of computing, unfortunately).

If there are 500 signs, for example, request each one with fields=images, and then apply a condition: if any image key in the traffic sign's images field is in the valid_images list, push the traffic sign with all its fields, including geometry, to the output file.
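The filtering step could look roughly like the sketch below. `fetch_sign` is a hypothetical callable standing in for the one-request-per-sign call with fields=images; it is assumed to return a dict whose "images" entry is a flat list of image keys, which may not match the real response shape.

```python
from typing import Callable, Dict, Iterable, List, Set


def filter_signs_by_images(
    sign_ids: Iterable[str],
    valid_images: Set[str],
    fetch_sign: Callable[[str], Dict],
) -> List[Dict]:
    """Keep traffic signs seen in at least one image from valid_images."""
    kept = []
    for sign_id in sign_ids:
        sign = fetch_sign(sign_id)  # one lookup per sign, as described above
        # A non-empty intersection means at least one image key matches.
        if valid_images.intersection(sign.get("images", [])):
            kept.append(sign)
    return kept


# Stub data in place of real API responses (all values are made up).
fake_api = {
    "sign-1": {"id": "sign-1", "images": ["key1", "key9"],
               "geometry": {"type": "Point", "coordinates": [12.0, 55.0]}},
    "sign-2": {"id": "sign-2", "images": ["key7"],
               "geometry": {"type": "Point", "coordinates": [12.1, 55.1]}},
}
result = filter_signs_by_images(fake_api, {"key1", "key2", "key3"}, fake_api.__getitem__)
```

Passing the fetcher in as an argument keeps the matching logic testable without network access; a real implementation would also need to handle rate limits for the per-sign requests.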

Rubix982 commented 3 years ago

@cbeddow how does this look? :smile:

cbeddow commented 3 years ago

@Rubix982 looks good, do you think we need to do much more to make this possible, or is it already possible?

Rubix982 commented 3 years ago

For the steps,

Get a list of images uploaded to their organization (request to the image coverage tiles, then filter by the organization)

The user can specify bounding boxes or [lng, lat] and query against a provided org_id

Filter for images during a certain date range, like last week (so apply a date filter).

All interfaces have a date range filter

Then they want to get all traffic signs visible in these images (so must really then call the same tiles from the traffic signs layer, using the same z/x/y)

We can call traffic_signs_in_bbox by generating a bbox from the GeoJSON and sending it as a parameter.
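As a sketch of the "generate a bbox from the GeoJSON" step, the helper below computes the extent of all Point features in a FeatureCollection. The name `bbox_from_geojson` is hypothetical, and the west/south/east/north keys are an assumption about what traffic_signs_in_bbox expects, so check the SDK documentation before relying on them.

```python
from typing import Dict


def bbox_from_geojson(feature_collection: Dict) -> Dict[str, float]:
    """Compute the bounding box of all Point features in a FeatureCollection."""
    lngs, lats = [], []
    for feature in feature_collection.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") == "Point":
            lng, lat = geometry["coordinates"][:2]  # GeoJSON order is [lng, lat]
            lngs.append(lng)
            lats.append(lat)
    if not lngs:
        raise ValueError("no Point features found")
    # Key names are assumed, not confirmed against the SDK.
    return {"west": min(lngs), "south": min(lats), "east": max(lngs), "north": max(lats)}
```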

Then filter by calling graph.mapillary.com/TRAFFIC_SIGN_ID?fields=images, and keep only the traffic signs where at least 1 image in the images list matches a universal list of images from the tile request that uses date and organization filter

The former part is easy; the latter part, "matches a universal list of images", is vague. Does this mean image Point features in a GeoJSON?
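If it does mean that, one possible reading is that the "universal list" is just the set of image IDs taken from those Point features. A minimal sketch, assuming each feature stores its image ID under properties["id"] (the property name is an assumption, as is the helper name):

```python
from typing import Dict, Set


def image_ids_from_geojson(feature_collection: Dict) -> Set[str]:
    """Collect image IDs from the features of a GeoJSON FeatureCollection."""
    ids = set()
    for feature in feature_collection.get("features", []):
        properties = feature.get("properties") or {}
        if "id" in properties:  # assumed property name
            ids.add(str(properties["id"]))
    return ids
```

A traffic sign would then "match" when at least one ID in its images list is in this set.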

@cbeddow, I think it can be done with what we have. We should add this to our documentation as an example of a series of more complex operations.