Open archived-m opened 4 years ago
I think solving this issue within Firebase is still interesting, given that the substantial work proposed in #106 has not been advanced yet, and that this issue is probably the most cost-inducing inefficiency in WTMG at the moment.
@micleyman's opening post is a great starting point. For reference, I'll tag the two proposed solutions as Firebase solutions A: geohashes, and Firebase solution B: aggregation.
First some context:
This means that most of our visitors want to see most of our database of gardens on map, which would make geohashes less useful.
Geohashes would be valuable and even necessary when…
But these cases don’t seem to be here yet, or are at least not the most important cost-drivers at the moment.
Google does not support collection aggregations natively, but details two methods to achieve it https://firebase.google.com/docs/firestore/solutions/aggregation.
The basic idea is, as Michiel suggests, to combine all gardens into a single or a few readable objects, whenever a garden gets added to the garden collection. The above docs suggest to do so either in a modification transaction, or in a cloud function trigger by modifications.
Where to store the aggregation? If I understand the idea correctly, we would use a Firebase document or documents in a new collection to aggregate the JSON data into.
To get an idea of the size, all garden data encoded in a stringifed UTF-8 JSON file is currently 1.7 MB: it would exceed the limit of a single Firebase document (apparently around 1 MB), but we could use several in a kind of “linked list”. When we strip info like the description that is not immediately needed for the overview map, we get a json of 812 KB. This might currently just fit into one Firebase document. However, when storing key-value data in a Firebase Document, Firebase is likely more storage-efficient than stringified JSON, which gives extra leeway.
Would we want to optimize this? It would be interesting to know the current rate of garden modifications, and if the envisioned “viewing costs” + “update costs” over a certain interval are indeed much lower than the “viewing costs” + “update costs” currently, though I highly suspect they are are.
It should be advisable to “debounce” the aggregation updates though, to prevent extra costs from a series of changes that happen in short succession.
We (Slowby) need to consider more how much time we would spend on each of these solutions, versus their benefits, to decide in which solution we want to invest our time.
Currently every time the map is loaded, every garden is one read in firestore. As the amount of gardens increases, it would make sense to optimize these reads in two ways: