I think solving this issue within Firebase is still interesting, given that the substantial work proposed in #106 has not been advanced yet, and that this issue is probably the most cost-inducing inefficiency in WTMG at the moment.

@micleyman's opening post is a great starting point. For reference, I'll tag the two proposed solutions as Firebase solutions A: geohashes, and Firebase solution B: aggregation.

Solution A - Geohashes: not for now

First some context:

Geohashes are a technique natively supported by Google as an alternative to geo-bounded queries: https://firebase.google.com/docs/firestore/solutions/geoqueries
Currently we have about 3412 listed gardens in the world, of which about 2650 are listed in Belgium. That’s ~78%.
Most visitors also come from Belgium, and they want to see all the gardens in their proximity: Belgium. This is why we currently center the map around Belgium https://github.com/WelcometoMyGarden/welcometomygarden/issues/55
I don’t think these ratios will change much, at least not before the next camping season.

This means that most of our visitors want to see most of our database of gardens on map, which would make geohashes less useful.

Geohashes would be valuable and even necessary when…

we keep using Firebase while growing
AND a significant part of our users wants to look at only a subset of the gardens: this happens when...
- many more gardens get added all over the world: when the 20% ratio grows larger.
- different map views like the garden-focused one https://github.com/WelcometoMyGarden/welcometomygarden/issues/56 are used more.

But these cases don’t seem to be here yet, or are at least not the most important cost-drivers at the moment.

Solution B - Aggregation: maybe

Google does not support collection aggregations natively, but details two methods to achieve it https://firebase.google.com/docs/firestore/solutions/aggregation.

The basic idea is, as Michiel suggests, to combine all gardens into a single or a few readable objects, whenever a garden gets added to the garden collection. The above docs suggest to do so either in a modification transaction, or in a cloud function trigger by modifications.

Where to store the aggregation? If I understand the idea correctly, we would use a Firebase document or documents in a new collection to aggregate the JSON data into.

To get an idea of the size, all garden data encoded in a stringifed UTF-8 JSON file is currently 1.7 MB: it would exceed the limit of a single Firebase document (apparently around 1 MB), but we could use several in a kind of “linked list”. When we strip info like the description that is not immediately needed for the overview map, we get a json of 812 KB. This might currently just fit into one Firebase document. However, when storing key-value data in a Firebase Document, Firebase is likely more storage-efficient than stringified JSON, which gives extra leeway.

Would we want to optimize this? It would be interesting to know the current rate of garden modifications, and if the envisioned “viewing costs” + “update costs” over a certain interval are indeed much lower than the “viewing costs” + “update costs” currently, though I highly suspect they are are.

It should be advisable to “debounce” the aggregation updates though, to prevent extra costs from a series of changes that happen in short succession.

Solution C - extracted garden microservice

106 is still an attractive long-term solution, and I think that extracting a “garden microservice” from Firebase (while keeping the rest) with Postgres/GIS or other services better suited to geoquerying would be an interesting first step towards self-owned, cost-efficient microservices. Chat seems currently well-served by Firebase, and authentication too if we could integrate it with an external microservice.

We (Slowby) need to consider more how much time we would spend on each of these solutions, versus their benefits, to decide in which solution we want to invest our time.

WelcometoMyGarden / welcometomygarden

Optimize amount of reads when fetching gardens #54

Solution A - Geohashes: not for now

Solution B - Aggregation: maybe

Solution C - extracted garden microservice