MichaelSolati / geofirestore-js

Location-based querying and filtering using Firebase Firestore.
https://geofirestore.com
MIT License

Slow Performance on 360k Records - Running as Node.js function #55

Closed stevenyix closed 5 years ago

stevenyix commented 5 years ago

I'm using geofirestore in a Firestore Cloud Function with an HTTP endpoint to perform a radius search: return the records within a given radius of a lat/lon coordinate.

This query runs against a document collection with 360k records.

The performance is quite slow: queries are taking 12-14 seconds according to the Cloud Functions log. In comparison, I created a small Cloud SQL Postgres database and an Express.js web API, and it returns results for identical queries in 300ms.

I have one composite index created; otherwise it's using the default settings. The composite index I created is: "d.locationType ASC g ASC".

Below is a screenshot of the document structure. I added a few child fields in the 'd' field created by Geofirestore.

I posted this question on the google-cloud-firestore-discuss@googlegroups.com discussion group, and one of the Firestore engineers suggested reaching out to you. He said he would be willing to talk with the geofirestore developers and possibly help. I'll point him to this issue.

Any guidance on how to optimize this so it runs faster, ideally in under 1 second?

[image: screenshot of the document structure]

samtstern commented 5 years ago

cc'ing myself here: curious to know how this library works and if there are possible optimizations!

MichaelSolati commented 5 years ago

@stevenyix hopefully I can figure out what is going on, and we can go from there. There may be a lot of moving parts here, so I would love to see a code snippet showing how your cloud function works. From there I can hopefully address any performance issues (and v3 might be better suited for you). Issue #38 raised a similar concern, where it seemed the data was being modified all the time, so the ready event never triggered. Knowing what's going on here would be very important.

Hey @samtstern, in short this works almost exactly the way geofire does, except the guts have been reworked to use Firestore. Effectively, we generate geohashes for the points around the center (and for the center itself), then run onSnapshot on each query generated. Since the geohash should already have severely limited the items being returned, we just do a quick check on the distance between the origin and each queried doc, and if it's in range we fire the callback to return the doc to the user.
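For readers following along, the final distance check described above can be sketched roughly as follows. This is a minimal illustration, not the library's code: it assumes the haversine formula and a spherical Earth of radius 6371 km, and the names `LatLon`, `Candidate`, `distanceKm`, and `withinRadius` are hypothetical.

```typescript
// Illustrative types; not geofirestore's own.
interface LatLon { lat: number; lon: number; }
interface Candidate { id: string; location: LatLon; }

// Assumption: spherical Earth with a mean radius of 6371 km.
const EARTH_RADIUS_KM = 6371;

const toRadians = (degrees: number): number => (degrees * Math.PI) / 180;

// Great-circle distance between two points, using the haversine formula.
function distanceKm(a: LatLon, b: LatLon): number {
  const dLat = toRadians(b.lat - a.lat);
  const dLon = toRadians(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) * Math.sin(dLat / 2) +
    Math.cos(toRadians(a.lat)) * Math.cos(toRadians(b.lat)) *
      Math.sin(dLon / 2) * Math.sin(dLon / 2);
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Keep only the candidates (already narrowed down by geohash) that
// really fall inside the search radius.
function withinRadius(center: LatLon, radiusKm: number, candidates: Candidate[]): Candidate[] {
  return candidates.filter((c) => distanceKm(center, c.location) <= radiusKm);
}
```

The point of the two-stage design is that this check only runs on the documents the geohash queries returned, so its cost depends on the size of that candidate set rather than on the size of the collection.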

stevenyix commented 5 years ago

@MichaelSolati thanks for taking a look at this. When I ran my queries none of the data was being modified.

Here are links to 2 gists:

This just creates my cloud function endpoint, takes the request and makes the call to a geofirestore query: https://gist.github.com/stevenyix/448f407cf14579f0f641bbd2348ed3ca

This contains the call to geofirestore. geoSearch() is the primary function. The implementation is a bit janky because it doesn't appear that geofirestore supports async/await or promises, so I put in a 1ms sleep() function that prevents the function from exiting until the 'ready' event changes a boolean flag to indicate the query is complete. If there's a more efficient way to do it - please let me know!
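One possible alternative to the sleep-and-flag approach is to wrap the event callbacks in a Promise that resolves when 'ready' fires, so the function can simply await it. The sketch below is an assumption-laden illustration: `EventedQuery` is a hypothetical minimal interface with 'key_entered' and 'ready' events whose registrations can be cancelled, not the library's actual typings, and `FakeQuery` is an in-memory stand-in that exists only to exercise the helper.

```typescript
// Hypothetical minimal shape of an event-based geo query.
type GeoEvent = 'key_entered' | 'ready';
type Listener = (key?: string) => void;
interface Registration { cancel(): void; }
interface EventedQuery { on(event: GeoEvent, listener: Listener): Registration; }

// Collect keys until 'ready' fires, then clean up and resolve.
// No polling loop or sleep is involved.
function collectUntilReady(query: EventedQuery): Promise<string[]> {
  return new Promise<string[]>((resolve) => {
    const keys: string[] = [];
    const registrations: Registration[] = [];
    registrations.push(
      query.on('key_entered', (key) => {
        if (key !== undefined) keys.push(key);
      })
    );
    registrations.push(
      query.on('ready', () => {
        registrations.forEach((r) => r.cancel());
        resolve(keys);
      })
    );
  });
}

// In-memory stand-in, used only to demonstrate the helper.
class FakeQuery implements EventedQuery {
  private entries: Array<{ event: GeoEvent; listener: Listener }> = [];

  on(event: GeoEvent, listener: Listener): Registration {
    const entry = { event, listener };
    this.entries.push(entry);
    return {
      cancel: () => {
        this.entries = this.entries.filter((e) => e !== entry);
      },
    };
  }

  emit(event: GeoEvent, key?: string): void {
    // filter() returns a copy, so listeners may cancel themselves safely.
    this.entries.filter((e) => e.event === event).forEach((e) => e.listener(key));
  }

  get activeListeners(): number {
    return this.entries.length;
  }
}
```

Inside an async HTTP handler this would be used as `const keys = await collectUntilReady(query);`, which keeps the function alive until the query completes without a busy-wait.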

https://gist.github.com/stevenyix/e2e3e06ba5e574c45d159beb7925ea09

Thanks again.

MichaelSolati commented 5 years ago

Hey @stevenyix, I'm not 100% sure whether the issue lies with the library, but I made some small changes to your function that will hopefully optimize it (let me know if it helps):

https://gist.github.com/MichaelSolati/c44d60126044778b5ee15054efc17d44

alexandregiordanelli commented 5 years ago

https://github.com/geofirestore/geofirestore-js/issues/63

The problem is that the library is downloading all 360k records and filtering afterwards on the client side.

MichaelSolati commented 5 years ago

@alexandregiordanelli the library does not in fact download all 360k records. It does an initial filter based on geohashes via Firestore's startAt and endAt functions, and then does another check on the client in case a document is slightly outside the radius (as is the nature of geohashes).
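To illustrate why a geohash range limits what gets read, here is a rough in-memory sketch. It makes no Firestore calls: `rangeScan` only emulates an ordered startAt/endAt scan over a geohash field, and the names `encodeGeohash`, `rangeScan`, and `Doc` are hypothetical. The field name `g` mirrors the one in the composite index mentioned earlier and is assumed to hold the geohash; using `'~'` as the upper-bound suffix is also an assumption, chosen because it sorts after every geohash character.

```typescript
const BASE32 = '0123456789bcdefghjkmnpqrstuvwxyz';

// Standard geohash encoding: interleave longitude and latitude bits,
// starting with longitude, and emit one base32 character per 5 bits.
function encodeGeohash(lat: number, lon: number, precision: number): string {
  const latRange: [number, number] = [-90, 90];
  const lonRange: [number, number] = [-180, 180];
  let hash = '';
  let bits = 0;
  let value = 0;
  let useLon = true;
  while (hash.length < precision) {
    const range = useLon ? lonRange : latRange;
    const coord = useLon ? lon : lat;
    const mid = (range[0] + range[1]) / 2;
    if (coord >= mid) {
      value = (value << 1) | 1;
      range[0] = mid;
    } else {
      value = value << 1;
      range[1] = mid;
    }
    useLon = !useLon;
    bits += 1;
    if (bits === 5) {
      hash += BASE32.charAt(value);
      bits = 0;
      value = 0;
    }
  }
  return hash;
}

interface Doc { id: string; g: string; }

// Emulates an ordered range scan over g from prefix to prefix + '~'.
// Only documents whose geohash starts with the prefix fall in this range.
function rangeScan(docs: Doc[], prefix: string): Doc[] {
  const start = prefix;
  const end = prefix + '~';
  return docs.filter((d) => d.g >= start && d.g <= end);
}
```

Because nearby points share a geohash prefix, a range like this touches only the documents in one cell, and the client-side distance check then discards the few that sit in the cell but outside the circle.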

MichaelSolati commented 5 years ago

@stevenyix closing this as it's been almost two weeks since I last heard from you. Please feel free to comment back if my solution didn't work for you.