WelcometoMyGarden / welcometomygarden

Web app of Welcome To My Garden, a not-for-profit network of citizens offering free camping spots in their gardens to slow travellers.
https://welcometomygarden.org
GNU Affero General Public License v3.0

Optimize number of reads when fetching gardens #54

Open archived-m opened 4 years ago

archived-m commented 4 years ago

Currently, every time the map is loaded, each garden costs one Firestore read. As the number of gardens increases, it would make sense to optimize these reads in two ways:

- Geohashes: only fetch the gardens inside the visible map area.
- Aggregation: combine all gardens into one or a few documents that can be fetched with a handful of reads.

th0rgall commented 1 year ago

I think solving this issue within Firebase is still interesting, given that the substantial work proposed in #106 has not been advanced yet, and that this issue is probably the most cost-inducing inefficiency in WTMG at the moment.

@micleyman's opening post is a great starting point. For reference, I'll tag the two proposed solutions as Firebase solution A (geohashes) and Firebase solution B (aggregation).

Solution A - Geohashes: not for now

First some context:

This means that most of our visitors want to see most of our database of gardens on the map, which would make geohashes less useful.

Geohashes would be valuable and even necessary when…

But these cases don’t seem to apply yet, or are at least not the most important cost drivers at the moment.
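For context on what solution A would involve: a geohash is just an interleaved binary subdivision of longitude and latitude, encoded in base32, so that nearby points share a common prefix. A minimal sketch in plain JavaScript follows; in practice a maintained library such as `geofire-common` (which the Firebase docs recommend for geoqueries) would be the better choice:

```javascript
// Minimal geohash encoder (illustrative sketch, not production code).
// Standard geohash base32 alphabet: digits plus letters, excluding a, i, l, o.
const BASE32 = '0123456789bcdefghjkmnpqrstuvwxyz';

function encodeGeohash(lat, lon, precision = 9) {
  let minLat = -90, maxLat = 90;
  let minLon = -180, maxLon = 180;
  let hash = '';
  let evenBit = true; // alternate between longitude and latitude bits
  let bit = 0;
  let ch = 0;

  while (hash.length < precision) {
    if (evenBit) {
      // Refine the longitude interval by one bit.
      const mid = (minLon + maxLon) / 2;
      if (lon >= mid) { ch = ch * 2 + 1; minLon = mid; }
      else { ch = ch * 2; maxLon = mid; }
    } else {
      // Refine the latitude interval by one bit.
      const mid = (minLat + maxLat) / 2;
      if (lat >= mid) { ch = ch * 2 + 1; minLat = mid; }
      else { ch = ch * 2; maxLat = mid; }
    }
    evenBit = !evenBit;
    // Every 5 bits form one base32 character.
    if (++bit === 5) { hash += BASE32[ch]; bit = 0; ch = 0; }
  }
  return hash;
}
```

The prefix property is what makes range queries on a single indexed string field possible: a query for all gardens whose geohash starts with `u4pru` returns everything in that cell, which is exactly the pattern Firestore's ordered queries support.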

Solution B - Aggregation: maybe

Firestore does not support collection aggregations natively, but Google's documentation details two methods to achieve them: https://firebase.google.com/docs/firestore/solutions/aggregation.

The basic idea is, as Michiel suggests, to combine all gardens into a single readable object (or a few of them) whenever a garden is added to the gardens collection. The docs above suggest doing so either in a modification transaction, or in a Cloud Function triggered by modifications.

Where to store the aggregation? If I understand the idea correctly, we would use a Firebase document or documents in a new collection to aggregate the JSON data into.

To get an idea of the size: all garden data encoded as a stringified UTF-8 JSON file is currently 1.7 MB. That would exceed the limit of a single Firestore document (apparently around 1 MB), but we could use several documents in a kind of “linked list”. When we strip info that is not immediately needed for the overview map, such as the description, we get a JSON file of 812 KB. This might currently just fit into one document. Moreover, when storing key-value data in a Firestore document, Firestore is likely more storage-efficient than stringified JSON, which gives extra leeway.
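To illustrate the “linked list” idea, here is a hypothetical packing step. The field names (`gardens`, `next`) and the chunk-id scheme are my own for the sketch, not from the codebase, and the size accounting is a rough JSON-string estimate rather than Firestore's exact storage model:

```javascript
// Hypothetical sketch: pack stripped garden records into aggregation
// chunks that each stay safely below Firestore's ~1 MB document limit.
const MAX_CHUNK_BYTES = 900 * 1024; // headroom below the 1 MiB limit

function chunkGardens(gardens) {
  const chunks = [];
  let current = {};
  let size = 2; // rough running size of JSON.stringify(current): "{}"

  for (const [id, garden] of Object.entries(gardens)) {
    const entryBytes = Buffer.byteLength(JSON.stringify({ [id]: garden }));
    if (size + entryBytes > MAX_CHUNK_BYTES && Object.keys(current).length > 0) {
      chunks.push(current);
      current = {};
      size = 2;
    }
    current[id] = garden;
    size += entryBytes;
  }
  if (Object.keys(current).length > 0) chunks.push(current);

  // Link the chunks: each document points at the id of the next one,
  // so the client can follow the chain until `next` is null.
  return chunks.map((chunkGardens, i) => ({
    id: `chunk-${i}`,
    gardens: chunkGardens,
    next: i + 1 < chunks.length ? `chunk-${i + 1}` : null,
  }));
}
```

With the current 812 KB of stripped data this would produce a single chunk; the chain only grows once the data outgrows one document.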

Would we want to optimize this? It would be interesting to know the current rate of garden modifications, and whether the envisioned “viewing costs” + “update costs” over a given interval are indeed much lower than the current ones, though I strongly suspect they are.
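This comparison can be made concrete with a back-of-envelope model. All numbers below are illustrative assumptions, not measured figures for WTMG:

```javascript
// Back-of-envelope comparison of monthly Firestore reads.
// Every parameter here is an assumption for illustration only.
function monthlyReads({ gardens, mapLoads, gardenUpdates, chunks }) {
  // Current: every map load reads every garden document.
  const current = mapLoads * gardens;
  // Aggregated: every map load reads only the chunk documents, and
  // every garden update costs roughly one extra read of the chunk it
  // rewrites (the write itself is billed separately).
  const aggregated = mapLoads * chunks + gardenUpdates;
  return { current, aggregated };
}

const r = monthlyReads({
  gardens: 3000,      // assumed garden count
  mapLoads: 10000,    // assumed map loads per month
  gardenUpdates: 500, // assumed garden modifications per month
  chunks: 1,          // aggregation fits in a single document
});
// r.current: 30,000,000 reads; r.aggregated: 10,500 reads
```

Under these assumptions the aggregated variant is cheaper by more than three orders of magnitude, and it stays far cheaper even if the update rate were 100× higher, which supports the suspicion above.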

It would be advisable to “debounce” the aggregation updates, though, to prevent extra costs from a series of changes that happen in short succession.
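The debouncing idea is a standard trailing-edge debounce: each change resets a timer, and the aggregation only rebuilds once the changes go quiet. A minimal sketch follows; the timer functions are injectable purely to make the sketch testable, and in a real Cloud Functions context (where instances are short-lived) something like a scheduled function or a task queue would presumably be the more robust equivalent — this is not the WTMG implementation:

```javascript
// Trailing-edge debounce sketch. `setTimer`/`clearTimer` default to the
// platform timers but can be swapped out for deterministic testing.
function debounce(fn, waitMs, { setTimer = setTimeout, clearTimer = clearTimeout } = {}) {
  let handle = null;
  return (...args) => {
    // A new change cancels the pending rebuild and restarts the wait.
    if (handle !== null) clearTimer(handle);
    handle = setTimer(() => {
      handle = null;
      fn(...args);
    }, waitMs);
  };
}

// Usage idea: wrap the (hypothetical) aggregation rebuild so that a burst
// of garden modifications triggers only one rebuild after things settle.
// const rebuildDebounced = debounce(rebuildAggregation, 60_000);
```

A library implementation such as lodash's `debounce` offers the same behaviour with extra options (leading-edge calls, a maximum wait) if more control is needed.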

Solution C - Extracted garden microservice

#106 is still an attractive long-term solution, and I think that extracting a “garden microservice” from Firebase (while keeping the rest), built on Postgres/PostGIS or other services better suited to geoquerying, would be an interesting first step towards self-owned, cost-efficient microservices. Chat currently seems well-served by Firebase, and authentication too, if we can integrate it with an external microservice.

We (Slowby) need to weigh the time we would spend on each of these solutions against their benefits, to decide which one to invest our time in.