codemonger-io / dogs-business

Track and share your dog's business

Request only nearby and recent business records #35

Open kikuomax opened 3 years ago

kikuomax commented 3 years ago

Requesting all of the business records online will incur too much network traffic and too many AWS charges. We have to limit the request.

Basic strategy,

kikuomax commented 3 years ago

This means we have to be able to query business records by geolocation. This article may help.

kikuomax commented 3 years ago

The S2 Geometry Library looks great. But we have to consider its compatibility with the map tile coordinate system supported by Mapbox (maplibre).

kikuomax commented 3 years ago

Since the search region depends on the visible area (zoom level) on the map, I think a single geohash is not sufficient for our purpose. Ideally, every business record should be indexed at each individual zoom level.
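As a rough illustration of indexing a record at several zoom levels, here is a minimal sketch that converts a longitude and latitude into map tile coordinates using the standard Web Mercator (slippy map) tile formula. The function names `tile_at` and `tiles_for_zoom_levels` are made up for this example and do not come from the project.

```python
import math


def tile_at(lon: float, lat: float, zoom: int) -> tuple[int, int]:
    """Return the (x, y) Web Mercator tile that contains a point at a zoom level."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 and latitudes beyond the Mercator limit stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_zoom_levels(lon: float, lat: float, zoom_levels) -> dict:
    """Map each zoom level to the tile containing the point (hypothetical helper)."""
    return {z: tile_at(lon, lat, z) for z in zoom_levels}
```

A record could then carry one tile coordinate per indexed zoom level, for example `tiles_for_zoom_levels(lon, lat, [15, 17])`, where the zoom levels passed in are only placeholders.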

kikuomax commented 3 years ago

According to the DynamoDB quota, a table can have at most 20 global secondary indexes. The 23 zoom levels supported by Mapbox (0 to 22) do not all fit within this limit.

kikuomax commented 3 years ago

We have to determine some typical zoom levels.

kikuomax commented 3 years ago

I use zoom levels 17 and 18 on my mobile phone while I walk my dog. I think zoom level 19 is close enough to determine the precise location of a business record. I often use zoom levels 15 and 16. I sometimes use zoom levels 11 to 14. I do not think zoom levels 0 to 10 make any difference.

kikuomax commented 3 years ago

Zoom levels to index,

The above indexing is enough for dog walking. Further indexing is necessary for browsing. Unfortunately, I have no clue about it.

Indexing zoom level 0 is not necessary because it is equivalent to scanning every record.

kikuomax commented 3 years ago

7 global secondary indexes should not do any harm, but they will consume more WCUs and RCUs.

kikuomax commented 3 years ago

To work around the limit on the number of global secondary indexes, we could store an additional item per zoom level in the table that associates a business record with its map tile coordinates at a specific zoom level. A primary key combination would be

Since this method needs more put_item requests, it will be more error-prone.

kikuomax commented 3 years ago

I prefer global secondary indexes as long as the quota does not matter.

kikuomax commented 3 years ago

To query business records of a specific dog at a zoom level z, here are the requirements for a primary key combination of the index corresponding to the zoom level z,

kikuomax commented 3 years ago

There are two different sets of requirements for a primary key combination, one for a specific dog (private view), and the other for all dogs (public view; i.e., #38).

kikuomax commented 3 years ago

One solution is to create 7 more similar indexes for the public view. This consumes precious indexes.

kikuomax commented 3 years ago

Another solution is to create separate items for public and private views. The two views share indexes but have different prefixes.

Private item,

Public item,

kikuomax commented 3 years ago

By the way, isn't it a bad idea to have a huge partition in DynamoDB? The map tile indexing I proposed here will create huge partitions, especially at lower zoom levels.

kikuomax commented 3 years ago

When I googled the problems of having a huge partition in DynamoDB, I found a scary article that said the partition size is limited to 10GB! But according to the documentation, this limit applies only to a table with one or more local secondary indexes. So this should not matter to the business record table.

kikuomax commented 3 years ago

A huge partition may be more susceptible to the capacity cap per partition (3,000 RCUs and 1,000 WCUs). It should not matter for our app so far.

kikuomax commented 3 years ago

I found that CloudFormation cannot create or delete more than one global secondary index in a single update. This is painful. https://github.com/aws-cloudformation/cloudformation-coverage-roadmap/issues/229

kikuomax commented 3 years ago

We have to provision global secondary indexes (GSIs) one by one. I hope I am not stupid enough to edit the CDK stack every time I provision a single GSI. Could we use a context value to control which GSIs are provisioned?
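One way this could work, as a minimal sketch rather than the project's actual stack: a context value gives the number of GSIs that should exist after a deployment, and the stack adds only that many. The helper name `zoom_levels_to_provision`, the context key `gsiCount`, and the index naming are all assumptions made for illustration. The CDK calls appear only in comments so that the helper runs without `aws_cdk` installed.

```python
def zoom_levels_to_provision(all_zoom_levels, context_value):
    """Return the zoom levels whose GSIs should exist after this deployment.

    `context_value` is the raw context value (usually a string when passed
    on the command line); it must be given explicitly so that a forgotten
    flag cannot add or drop several GSIs in a single update.
    """
    if context_value is None:
        raise ValueError("the GSI count context value is required")
    count = int(context_value)
    if not 0 <= count <= len(all_zoom_levels):
        raise ValueError(f"GSI count out of range: {count}")
    return list(all_zoom_levels[:count])


# Inside a CDK stack (aws_cdk v2, Python) the wiring might look like this,
# where ZOOM_LEVELS and the "gsiCount" key are hypothetical:
#
#   count = self.node.try_get_context("gsiCount")
#   for z in zoom_levels_to_provision(ZOOM_LEVELS, count):
#       table.add_global_secondary_index(
#           index_name=f"zoom-{z}", partition_key=..., sort_key=...)
```

The stack would then be deployed repeatedly with an increasing count, for example `cdk deploy -c gsiCount=1`, then `cdk deploy -c gsiCount=2`, so that each update adds exactly one GSI.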

kikuomax commented 3 years ago

How do we get map tiles visible on the screen? We can listen for "sourcedata" event to know which map tile is requested, but no event is notified after the map tile is cached.

kikuomax commented 3 years ago

My concern is that once the business records in a map tile are queried on a "sourcedata" event, they will not be re-queried unless the Mapbox cache is cleared. But this should not matter unless you want to monitor business records of your dog friends updated by someone other than you, because updates made by you are immediately recorded in memory.

kikuomax commented 3 years ago

We will have to invent our own caching feature eventually, but let's use "sourcedata" events for now.

kikuomax commented 3 years ago

Because not all zoom levels are indexed, use the following algorithm to cover a queried map tile at (x, y) at zoom level z.

  1. Use exact z, x and y if z is an indexed zoom level.
  2. Use z-1, floor(x/2) and floor(y/2) if z-1 is an indexed zoom level.
  3. Use z-2, floor(x/4) and floor(y/4) if z-2 is an indexed zoom level.
  4. and so on

[Figure: zoom-level-covering]
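The steps above can be sketched as follows. This is only an illustration: the function name `covering_tile` is made up, and the set of indexed zoom levels is passed in as a parameter because the actual list is not shown in this thread.

```python
def covering_tile(z: int, x: int, y: int, indexed_zoom_levels):
    """Return (z', x', y') of the indexed tile that covers tile (x, y) at zoom z.

    Walks from z down to 0 and stops at the first indexed zoom level.
    Returns None if no zoom level at or below z is indexed.
    """
    for shift in range(z + 1):
        if z - shift in indexed_zoom_levels:
            # Shifting right by `shift` bits equals floor(x / 2**shift).
            return z - shift, x >> shift, y >> shift
    return None


# Placeholder zoom levels, chosen only for this example.
example_levels = {11, 15, 17}
print(covering_tile(16, 101, 203, example_levels))  # one level up: (15, 50, 101)
```

Since the covering tile is at least as large as the queried tile, a query against it may return records outside the queried tile, so the caller may want to filter the results by location.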

kikuomax commented 3 years ago

One problem with depending on "sourcedata" events is that the maximum zoom level is capped by that of the event (tile) source. It is 16 in the case of the style mapbox://styles/mapbox/streets-v11.

kikuomax commented 3 years ago

A global business explorer (#38) will be added as a map tile source in the future. For finer zoom levels, I think we can count on it.

kikuomax commented 2 years ago

I realized that zoom level zero also has to be indexed, because otherwise there is no index for scanning all of the public records.