Slow Firestore queries on large collection

ghost commented 6 years ago

[REQUIRED] Step 2: Describe your environment

Xcode version: Xcode 10
Firebase SDK version: 5.8.1
Firebase Component: Auth, Core, Database, Firestore, Messaging, Storage
Component version: 5.8.1 (0.13.3 for FireStore)

[REQUIRED] Step 3: Describe the problem

I am facing the exact same problems as outlined in this issue (problem #1 and #2). I don't think a follow-up issue was ever created so here I go: • Querying large collections (3000-5000 docs) is fairly slow (~3-4 seconds) • With persistence enabled, sorting (by timestamp) and limiting (say 300 most recent docs) the query is even slower as the local cache queries the whole collection before sorting and limiting it... defeating the purpose of sorting and limiting it 🤓

The issue is linked to the locally persisted cache as turning persistence off takes care of the problem... That being said the persistence feature is a key reason I am using Firestore and would like to keep it and would like to avoid the operational headache of maintaining two DBs for each collection (one with all docs and a second that's read-only with a subset of the latest docs).

99% of my users are not yet facing this slow-down as their collections are not big enough but can I expect a fix to be coming in short-term (within next 30-60 days) or should I start building an alternative solution? I believe the fix would be to let the local cache index its documents/ collections (or have it download and use the index created by the server for the collections cached locally).

Steps to reproduce:

Step 1: Create a large collection of 3000-5000 documents Step 2: Attach a snapshot listener to it - with and without sorting and limiting the resulting doc snapshots.

Happy to also share my own profiling but I believe the initial post by KabukiAdam outlines the issue well.

Relevant Code:

let channelRef = UserDataService.instance.FbDB.collection("dmChannels").document(channelId).collection("messages")
               .order(by: "timeStamp", descending: true).limit(to: 300)

let messageListener = channelRef.addSnapshotListener(includeMetadataChanges: false) { [weak self](snapshot, error) in
                    guard self != nil else {return}
                    if let error = error {
                        print ("Could not download messages")
                        print(error)
                        completion(false)
                    } else {
// Do something with resulting docs
                  }

wilhuff commented 6 years ago

All the commentary on the linked issue remains relevant. I can't comment publicly on specific timelines for any changes we might be making, sorry.

Client-side indexing is planned, but is complicated by the fact that we need to integrate both remote documents and local changes into the index. We also have the problem that indexing all fields by default amplifies writes considerably. The server copes with this through parallelism, scattering the index updates across many servers. Locally we only have the one device :-).

Additionally there's the issue of how to handle composite indexes. A simple way forward would be to just pick one index or merge join on single-field indexes but this has spectacularly bad worst case performance with low-cardinality fields.

Downloading index definitions from the server runs up on the shoals of differing authorization models between the main API and the administrative API. Index definitions are not public, and there's no way to do Firebase Auth for the admin API. This could be something we could do at build time but that doesn't work particularly well for the web SDK. Changing the authorization model is non-trivial.

An alternative we've considered is to build indexes on demand (composite or otherwise). This works surprisingly well in initial usage scenarios, but makes signaling app upgrades tricky.

One thing that's coming sooner than indexing is garbage collection. We've already integrated all the bookkeeping and are currently working through threshold triggering and an API for exposing gross control over the cache size.

Basically, yes this is a known issue, and yes we want to address it, but can't promise when it will be available.

KabukiAdam commented 6 years ago

@fschaus Thanks for creating this issue. I had been meaning to create a follow-up to the original issue you linked to, but had been distracted on other tasks for a while. I came back to add it, and saw that you just did! :)

willtr101 commented 6 years ago

Well, we are looking forward on this to be solved too.

ghost commented 6 years ago

Thanks @wilhuff - I appreciate the transparency and the explanations!

I guess I'll go ahead and build a custom work around and develop an archiving logic for documents that are more than x days old on cloud functions that I'll get triggered either through a cron job or by my users' client.

Three quick questions on that: • I know firestore doesn't have a doc/ collection move function, but from experience do you think cloud functions can handle copying 1000-2000 documents in a new collection, and deleting the whole "old" collection (message that have been archived) in one go without triggering a time-out? • Should I be aware of any inherent limitations from cloud functions or firestore when copying large troves of data? Can I be guaranteed that every file will be copied once or should I build strong redundancy/ verifications in my code? • Last question - I need to decide where to create my archive collection - one option is to create a root-folder in my DB with an Archive collection, that contains Chat rooms, that contains messages. The other options is to have it in the document (Chat room) it belongs to (so Root > ChatRooms > Archive > CollectionOfArchivedMessages). If I go for the latter, I was wondering if it would impact performance of my snapshot listeners? I typically put it on the ChatRoom document (e.g., to monitor participant list, etc.) and on the message collection. Is there a performance cost from having a large collection nested in a document I deploy my snapshotlistener to? Is there a penalty from having a large collection being a sibling from another collection I deploy a snapshotlistner to? Would love to know for both server and local queries (persistence enabled).

Sorry to ask so many questions, but since I'm going to have to build this from scratch, I am just trying to have as much information as possible to avoid ending up in the same spot as before (e.g., creating an archive collection in the same document, just to realize it slows down the query just as much as before).

Thanks again! Francois

samtstern commented 6 years ago

I know firestore doesn't have a doc/ collection move function, but from experience do you think cloud functions can handle copying 1000-2000 documents in a new collection, and deleting the whole "old" collection (message that have been archived) in one go without triggering a time-out?

You can increase Cloud Functions to use up to 2GB of RAM and have up to a 540s timeout. That should give you time to move plenty of documents. You should be able to sustain 500 writes per-collection-per-second in the worst case, so that's more than 250,000 documents in 540 seconds.

wilhuff commented 6 years ago

Before you go building such a thing I'd suggest you carefully verify that it has the intended effect. Be especially careful when trying to observe the cache during testing because running a query can have side effects (a side-effect free way to do this is to use a get(source:.cache)).

The principal problem to be careful of is that Firestore only deletes documents from its cache under two conditions:

If the server tells it to do so while the client is listening (or re-listens within 30 minutes), or
when there's a mismatch between the cache contents and the query results for the same query.

So, ignoring deletions that happen in the first case, you have to be careful what you query for in order to trigger the second behavior.

For example, if you query for for all data newer than 30 days ago, it won't matter if you delete data on the server that's older than that because the client won't see a mismatch between the server response and what it has in cache. To make this work you'd need to query for more data (e.g. the whole collection) so that the client will recognize that there are documents in its cache that match the query that don't exist on the server.

In short: to make this work apply the limit implicitly by way of deleting the data rather than explicitly.

However note that garbage collection is just around the corner. You can even kick the tires now by using the gsoltis/lru_gc_disabled development branch. I can't really comment on when we're launching this for real, and I can't promise that the current state of that branch won't liquify your hard disk, but it should be safe to try out with a non-production project. We'd love your feedback!

ghost commented 5 years ago

Thanks @wilhuff for the detailed comments on the expected behavior.

This is anyway how I proceed - I always query all data from the server since querying only a subset (say last 30 days) doesn't make any difference in terms of performance (i.e., the client 'query' anyway returns every single document in a collection, before sorting/ filtering them locally on the client and only returning the one requested which is sloooooow). That's why the only viable option seems to move/ delete/ archive the documents on the server, so that when I run a full query on that collection it has a manageable set of documents and can return in less than 1sec.

You picked my interest with the local query, but I guess you can't do a local query with a snapshotlistener (my use-case)? On top as you outlined, if the local cache finds a mismatch between itself and the server, it will flush itself before redownloading all docs right? If instead it had a more subtle behavior of just getting and caching the 2-3 new docs not in cache, it could be a viable option that would still prove faster than redownloading every single docs from the server.

I also wanted to report that I tested for 2 weeks the latest release with garbage collection and did not encounter any bugs/ data corruption/ etc. That being said, it did not move the needle for my issue (not surprising since we are talking about slightly different issues).

I guess I'll try to implement the archiving function on cloud functions in the coming days/ weeks. Still debating if I'll do it using a cron job or a simplistic http trigger from the clients if it finds doc that are 30 days old...

Anyways, thanks for the help and the details re. the cache's behavior!

wilhuff commented 5 years ago

I'm glad to hear that GC is working. What I was thinking would help your case is if you dialed down the size of the cache via the new option cacheSizeBytes you'd reduce the number of documents to scan automatically.

witbybit commented 5 years ago

We are facing similar issues in our app too. Fetching just 2 or 3 documents from a collection of more than 10,000 documents using a Firestore query is very slow. It has even taken more than 10 second sometimes. @fschaus Did archiving help with the speed?

ghost commented 5 years ago

Not really. Right now the only way to make querying doc in large collections work is to disable persistence... Which is really not great since persistence is a key feature but had to make that trade-off as the queries were becoming way too looooong.

@wilhuff Any update on when local indexing might make it to the iOS SDK? Thanks!

wilhuff commented 5 years ago

No update, unfortunately.

Have you tried setting cacheSizeBytes to something small (like a few MB)? If yes and this didn't help I'd like to know more.

ghost commented 5 years ago

Thanks @wilhuff ! I've been testing this on our staging build and so far it has worked remarkably well!

One issue we've noticed however, is that for users whose cache was previously large, setting the 'cacheSizeBytes' to something low does not immediately clear the caches and can leads to frustratingly long loading times.

Is there any way we can force the cache to clear? Right now the only we have is to delete the app and reinstall which is obviously not the best way. Can you also confirm that the garbage collection will work even if it takes a few days? Or could there be situations in which the cache never truly clears?

Thanks again for your help on this!

AchidFarooq commented 5 years ago

I have been reading into every issue available on the internet and sorry to say but the problem still exists untill now. We are using FireStore since oktober 2017 in our iOS app, but sad to say we regret it. One of the key features of firestore has to be the capability to get a large collection of documents in a fair amount of time. Now we even have to disable the persistence, and even then its still slow.

Adding a snapshot listener to a collection with 4000 documents returns data in appr. 9 till 30 seconds.

In the beginning when we started using Firestore for iOS our reassurance was that it was still in Beta. But now this problems still exist. Does anybody have a solution yet?

Here is an example, we are trying to get 3 documents from a 4000 document collection with the same query from a iOS app in Firestore.

** Call with addSnapShotListener - return time -> 9.354734063148499 s. ** Call with getDocuments - return time -> 9.92848801612854 s.

Offline: ** OFFLINE: Call with addSnapShotListener - return time -> 9.441628098487854 s. ** OFFLINE: Call with getDocuments - return time -> 10.107746958732605 s.

Our internet speed is an average of 340 mbps.

heumn commented 5 years ago

Just in case anyone else is looking for "How can I at least reduce the query time?" Disabling persistency reduced my fetching (on very large collections) from 45-55 seconds down to 10-20 seconds. Also added my own caching on top of if it to make it bearable for users.

EDIT:

I have to note that I am was for 3000 documents here, thus a very slow query is expected. Just not as slow as 55 seconds. As this is beta phase of our app I am doing adjustments to how this is fetched and stored in the future (doing the math on these documents server side). I have strong beliefs that the firestore team will deliver better on both query speed and better local cache in the future so I can absolutely live with this for now.

AchidFarooq commented 5 years ago

Thanks @heumn, we also experience this. We have send a new build to our users with persistence off and indeed added a cache. The only problem left is how to tell our users that the app is not usable in offline mode even though we sold it with a offline feature to them.

Also I have had some contact with the firebase support team via email. and they suggested to check in which country our current firestore server is located. When we started the project the only possibility was US (we are located in EU, Amsterdam). He suggested to change to the Frankfurt server and test if that makes a difference.

We will test that and check if that makes things better. WIll let you guys know.

I must say that the google response team is very fast in responding and very understanding. Makes things a little bearable. ;)

witbybit commented 5 years ago

@AchidFarooq Did changing the region help? Has anyone found a good solution to this problem apart from disabling persistence? Since we want our app to work offline smoothly, we will have to write our own persistence caching logic for sqlite which isn't ideal.

AchidFarooq commented 5 years ago

@nikhilag Sorry, i forgot to write a message here. But no it did not work. We have seen no improvements when changing our region from US to the EU (Frankfurt). The speed issue was the same when we run our test on large collections. We both used addSnapShotListener and getDocuments call. At the moment we get a lot of complaints from our users about the speed. But we have tried everything. So we're just waiting for the Firestore team to come up with a magic solution or an idea on how to work around this issue.

ghost commented 5 years ago

Same here could not get it to work and limiting the cashbytesize actually introduced some very unstable querying from the cache - for ex., it would query 5 docs out of the 25 to be displayed from the cache, then the rest from the server, either creating very weird UI bugs (lots of flickering, unstable UI) or some weird race conditions.

I "solved" the issue by limiting the # of docs queried and displayed at any given time, using the great pagination tools Firestore provides (basically an infinite scroll view that fetches the next 30 docs whenever needed). The firebase community team made some great videos on how to implement pagination and I'd be happy to share some code snippets on how to handle from the UI side (basically the TableView prefetching).

Good luck!

AchidFarooq commented 5 years ago

Hi guys, here are some new test results, maybe this can help you guys. I suggest you turn the FireStore logging on, this helps a lot to see what happens in the background with Firestore

The biggest thing we found is that if you do exactly the same query but then with Persistence DISABLED firestore is super fast.

iOS (do this before the init of firestore FirebaseApp.configure()): FirebaseConfiguration.shared.setLoggerLevel(.max)

Test parameters

11.000 documents collection
every query tries to get 1 document from this collection
We use location Frankfurt as our firestore location

Results of the test:

TEST 1
Persistence ENABLED
Method used: getDocuments()

RESULTS:
******CACHE return time -> 21.99819004535675s. - Nr of documents -> 1
******SERVER return time -> 48.675917983055115s. - Nr of documents -> 1

TEST 2
Persistence ENABLED
Method used: addSnapShotListener()

RESULTS:
******local cache return time -> 23.558130025863647s. - Nr of documents -> 1
******server return time -> 25.278698086738586s. - Nr of documents -> 1

TEST 3
Persistence DISABLED
Method used: getDocuments()

RESULTS:
******CACHE return time -> 0.0060280561447143555s. - Nr of documents -> 0
******SERVER return time -> 0.18609201908111572s. - Nr of documents -> 1

TEST 3
Persistence DISABLED
Method used: addSnapShotListener()

RESULTS:
******server return time -> 0.15342891216278076s. - Nr of documents -> 1

mikelehen commented 5 years ago

@AchidFarooq Thanks for the added info! Unfortunately this is probably expected at present. The problem is that the SDK doesn't implement client-side indexing and so when you perform a query and have lots of documents cached locally for the collection in question, the client has to read each document in the collection to determine if it matches the query. This takes a long time and the time taken is proportional to the number of documents in the cache.

The suggested workaround until we're able to tackle the client-side indexing feature is to keep the cache size low or turn persistence off entirely (depending on app requirements).

witbybit commented 5 years ago

Similar to the get() call, can we please add an option for addSnapshotListener() to fetch only from server and skip cache?

AchidFarooq commented 5 years ago

@mikelehen Thanks for your response, the tests were just to clear things up for us and maybe for everybody else who has the same problem. Now i do have a better understanding of what happens. We hope you find a way to tackle the problem, for now we are using your workaround. Thanks, good luck!

marcglasberg commented 5 years ago

@mikelehen You say a solution is to keep the cache size low. However, the allowed minimum is 1Mb. Could you please at least fix this by letting us define the cache to be really small, say, 50kb?

We can't really turn the cache off, because it then reads all documents in the snapshot and charges for them. Say the cache is off and I have a listener which will get 100 messages per day. I'll pay 1 read when the first message arrives, 2 reads when the second message arrives (since it will read both messages), 3 reads for the third and so on. After 100 messages I will have paid 1+2+3+4+...+100=5050 reads, instead of just 100 reads if the cache is on.

In this situation I need the cache to hold 100 messages only. But if each message is 500 bytes, then it fits 2000 messages in 1Mb, and the queries will be slow already.

yamauchieduardo commented 5 years ago

I've opened an issue to reduce the cache size: https://github.com/flutter/flutter/issues/35648

witbybit commented 5 years ago

On Android, I ended up writing the offline workflow myself because of poor performance of firestore's offline workflow. Since I know my schema, I can do a much better job of storing the data in sqlite. It took a couple of weeks but it was worth it. Now firestore offline workflow is disabled and my queries are much faster for fetching certain data which I never needed offline anyway. I also have total control over what data I always want available offline instead of relying on the sdk which was caching ALL the data and at times throwing out the useful data from cache. Obviously there are some cons to this approach where I had to write my own workflow for saving data offline but it wasn't too bad. I think offline support is a hard problem for firebase team to solve in a general way such that it works for everyone. Maybe in future they will introduce support for indexed data and more control over what queries should work offline.

mikelehen commented 5 years ago

@marcglasberg I don't know of any situation where you should encounter 1+2+3+4+...+100=5050 reads for a single listener. If you're seeing something like that, can you open a new issue with details?

In general, when you start a listener, you are charged 1 read for each result that matches the listener, and then an additional read every time a new item matches the listener. But you shouldn't be re-charged for existing items.

marcglasberg commented 5 years ago

@mikelehen According to Doug Stevenson in this StackOverflow answer: https://stackoverflow.com/questions/48489895/firestore-query-reads-all-the-documents-again-from-database-when-only-one-is-m/48491561#48491561

When you use onSnapshot() to request a snapshot of a collection (or query), any change to a document will cause the entire snapshot to be sent to your callback. (...) If any of the documents in that snapshot could come from the local cache, that's the data you will receive, and it will not count as a read. (...) So, in your case, if you do not have a local cache enabled, you may be charged for a read on all the returned documents at every change.

So according to him: "you may be charged for a read on ALL the returned documents at EVERY change."

In other words, if I have the cache DISABLED, then if the first query returns 1 document I will pay 1 read, yes. But as soon as there is a second document that matches the query, the query will return both the first and the second document, and I will pay for both of them (I will pay for the first document again). Note: The cache is off, and Firebase doesn't know the first item is an "existing one".

wilhuff commented 5 years ago

This summary is incorrect.

Our pricing documentation is here: https://firebase.google.com/docs/firestore/pricing#operations. This is what it has to say about how you're charged:

When you listen to the results of a query, you are charged for a read each time a document in the result set is added or updated. You are also charged for a read when a document is removed from the result set because the document has changed. (In contrast, when a document is deleted, you are not charged for a read.)

Also, if the listener is disconnected for more than 30 minutes (for example, if the user goes offline), you will be charged for reads as if you had issued a brand-new query.

That is to say, that generally you're charged for the changes to a snapshot, not the number of documents in it.

If you have a query result containing 100 documents and then you update one document in the result 100 times, you'll be charged 100 reads to initially populate the result set and then 1 read for each update for a total of 200 reads. We don't charge for the 99 documents that didn't change at each update after you start listening.

The second part comes into play when you're using the offline cache. If you resume a query within 30 minutes we don't charge to for the initial 100 documents. Those are pulled from the offline cache and we only charge for what has changed since you stopped listening.

I'll poke Doug to correct his answer.

marcglasberg commented 5 years ago

@wilhuff You say "The second part comes into play when you're using the offline cache." and "Those are pulled from the offline cache".

But my point (and Doug's point) is what happens when you have the offline cache DISABLED.

Are you saying that you only charge for the changes to a snapshot, EVEN WHEN THE OFFLINE CACHE IS DISABLED?

wilhuff commented 5 years ago

Correct. While you're listening to a query we only charge for the documents that actually changed, not for the total number of documents that are in the snapshot.

marcglasberg commented 5 years ago

Ok, thanks. That's good news in fact.

The documentation is not clear enough, which leads to a lot of questions in StackOverflow. Even if Doug corrects his answer, I would suggest the documentation must be improved a lot, so that people stop asking for clarifications there.

audkar commented 5 years ago

The second part comes into play when you're using the offline cache. If you resume a query within 30 minutes we don't charge to for the initial 100 documents. Those are pulled from the offline cache and we only charge for what has changed since you stopped listening.

In real world scenarios user doesn't keep his messages application open all the time and doesn't wait for all 500 messages to come. He open, reply, close app. Snapshot listeners will change. And this second part will kick in very often.

Heck, users even in same session often is going back and forth between screens.

AchidFarooq commented 5 years ago

I thought on giving you guys maybe an update on things right now in our app. We started implementing much more direct calls to documentId's to solve the client side indexing at the moment.

Before, if we had an ID for instance, and we wanted to look up the document matching this key we would perform a query with the .whereFieldIs method. This was taking very long. Now we use much more relations inside documents to get the specific document. The only downside of this is that we still have the problem when we need to get multple documents. But with this we enabled the persistence again so thats also a plus side.

LCostaN commented 5 years ago

Is this issue still open? I've been trying to reproduce it to make sure, but I haven't succeeded yet. Did it actually get fixed and someone forgot to close this? Or have I misunderstood the real problem here?

wilhuff commented 5 years ago

The problem described by this issue is that very large single collections negatively impact the client's query performance. This is still true. It's also something we're working on addressing which is why the issue is still open.

Beeen commented 5 years ago

At app start we need to load multiple profiles by their ID resulting in multiple calls to addSnapshotListener() which is very slow to get all profiles needed. It seems to create a burst that we didn't experience with RealTime Database (profiles were in RTDB back then) Is our problem related to this issue or does it concerns only querying collections with sorting and limiting ? Are we doing it wrong ?

var-const commented 5 years ago

@Beeen The issue should only manifest on relatively large collections (at least a few hundred documents, or containing very large documents). Could you please provide more details on what you're doing (ideally with code examples)? In particular:

how large are the collections/documents you're loading (both in terms of the number of documents and the size of the documents);
how long are the delays you're experiencing?
can you provide a step-by-step outline of your code? If you could just post some sample code (simplified if necessary), that would be even better.

iwasrobbed-ks commented 4 years ago

There are some concerning architectural issues if we can't paginate 50 at a time of 1,000 total records of something (photos, messages, restaurants, ). You don't really need to be "at scale" because 1,000 isn't that many of something to be querying through even for small apps 🙁

This seems to go beyond just indexing alone; creating a "large" (> 1k) collection and then deleting them just causes the querying to break and never respond without reinstalling.

Local cache seems easily broken by common CRUD actions, regardless of if the client is online or not, and never seems to correct itself on subsequent syncs with the server. Unfortunately, local cache is not just an object store but rather the frontend of the querying capability, so when this breaks, the entire app grinds to a halt.

marcglasberg commented 4 years ago

@wilhuff @mikelehen In 4 months we'll be launching an app based in Firebase (a 2-year effort), and it's worrisome to read comments like the previous one. Could you please comment on "Local cache seems easily broken by common CRUD actions" and that the cache can't heal itself? As of now, should we trust Firestore with the future of our company? If our app grinds to a halt it's going to be the end of us.

mikelehen commented 4 years ago

@rob-keepsafe If you're seeing requests fail to ever finish or return incorrect results (which I think is what you are saying when you refer to "querying to break" and "local cache seems easily broken ... and never seems to correct itself"), please open new issues with the details on what you're seeing so that we can investigate. That doesn't sound expected.

This github issue is specifically tracking the fact that very large single collections in the cache negatively impact the client's query performance. But the queries should still always complete with correct results.

The good news is we have made some progress on query performance. We have a change that should be released for Android very soon and later for JavaScript and iOS which should make queries that you've executed previously return results significantly faster, even if the cached collection is very large. We'll update this thread once it's available and would welcome feedback.

@marcglasberg We have thousands of production apps happily running on Firebase and Cloud Firestore. In general, the best thing to do is just to test your app thoroughly before releasing, perhaps being mindful of any operations you can do in the app that might increase the cache size (e.g. you may not want to expose functionality to download a large collection for offline viewing, since this would slow performance for all queries against it).

heumn commented 4 years ago

Here I have to support @mikelehen. Although a query for lots of data on huge collections isn't ideal and the firebase local cache is less than optional. Firebase itself is a really powerful service and I would never have been able to build the products I build if it didn't exist. I have never seen errors like those @rob-keepsafe is reporting.

deleted rant comment

iwasrobbed-ks commented 4 years ago

@marcglasberg I would just recommend creating a minimum viable performance testing app that runs through your typical user stories and tests where the performance degrades. For us, it unfortunately happened very quickly (a user has >1,000 items that can be simply sorted), even with pagination. This is because all querying/sorting/filtering is done in-memory as pointed out above.

If it works for you now and continues to work at the collection size you're seeing at scale, excellent 👍

@heumn Same for you, glad it's working 👍 During our initial performance tests, it didn't. I'd rather people perform their own tests based upon their unique schema and user stories rather than blindly trust and build upon it for years only to hit issues after they launch. As you can see from this thread, a number of others are "rant"ing as well, and seeing the same 40 to 50 second delays we saw ¯_(ツ)_/¯

The collection wasn't "huge" either; all the sample apps show a restaurants search app, so you can easily imagine trying to query 1,000 restaurants in a city and hit the same performance issues. We were under the impression it would "just work" based off all those examples, but were disappointed when it didn't.

heumn commented 4 years ago

@rob-keepsafe I am one of those complaining 😅

Slow queries for large amount of data is stuff you can easily work around. Same with the cache. Sure, it isn’t optional. I just felt you went a bit over the top. Sorry for calling it a rant. I can relate to the frustration.

Viktoru commented 4 years ago

I have one million documents. Any suggestion you have to return one value from a collection documents? Each document has 3 values, numeric values.

Somebody said: no matter how many documents are in a collection, the queries will take the same amount of time. Say you add 130 document and (for sake of example) it takes 1 second to get 10 documents out of it. That's the performance you'll get no matter how many documents are in the collection. So with 1300 documents it will also take 1 second, with 13K it will take 1 second, and with 13M, it will also take 1 second.

jamesdixon commented 3 years ago

@mikelehen was the fix you mentioned in https://github.com/firebase/firebase-ios-sdk/issues/1868#issuecomment-552496117 ever released?

We're using the Flutter SDK and we have LOTS of documents and running into performance issues with the local cache. Local indexing as well as more control over the local store would probably solve a lot of our issues. In addition to the performance issues, we can't selectively delete items from the store so everything stays in place making things slower and slower. I'd love to be able to call something on the SDK to delete anything that has already been synced.

samtstern commented 3 years ago

@jamesdixon @mikelehen is no longer on the team but @schmidt-sebastian and others have made many offline performance improvements since that comment was posted.

You could see some benefit from reducing the maximum Firestore cache size: https://firebase.google.com/docs/firestore/manage-data/enable-offline#configure_cache_size

If you have specific queries / benchmarks to share about the slow performance you're seeing that would be helpful!

jamesdixon commented 3 years ago

@samtstern thanks for the reply!

From what I've read, the cache being slow is a known issue across the board. I'm happy to share more information about our use case but the gist of it as that we can have thousands of documents in the cache and reading those documents from the cache takes a non-trivial amount of time.

That said, a few questions:

I've seen other threads that mention indexing fields on the cache is something planned/being worked on (?). Can anyone provide an update? I'm sure there are challenges but I have to think that indexes on the cache could greatly speed up access time compared to the current method of loading everything into memory and filtering those results.

You mentioned reducing the cache size. I have a bunch of questions on this specifically...

Can you tell me more about the strategy for purging documents from the cache?
Would documents that have not yet been synced be purged?
Is there a certain amount of time that has to pass where a document has not been accessed before it's purged? If so, what is that time?
Is the cache size a hard limit or is it flexible? Meaning if I have many documents that match the criteria for not being purged and the total size of those documents exceed the cache size, what happens? Can the cache size "flex" or do you start purging documents until that set cache size is reached?

Also, I'd like to make a recommendation and maybe this already exists, but from what I've seen, these limitations of Firestore, specifically regarding persistence, are not well-documented. It wasn't till after we ran into these issues and I started investigating that we uncovered mentions of Firestore having these problems and not being designed for handling large amounts of data locally.

I ended up seeing official mention of this in this article and this article on the Firebase blog. I think it's important to surface these limitations in a more visible fashion so other developers can avoid some of these pitfalls before getting too deep into Firestore.

cc: @schmidt-sebastian

Thank you for hearing me out!

schmidt-sebastian commented 3 years ago

We are actively working on adding indexing to the SDKs, but this is a multi-month effort for each of our platforms.

jamesdixon commented 3 years ago

Thanks. I won't ask for an ETA but could you possibly answer my other requests regarding the cache and how it behaves?

schmidt-sebastian commented 3 years ago

Documents are purged from cache in a background task. This task is run once after 1 minute and then once ever 5 minutes. The whole process is based on the size of the LevelDB file.

If the file size of the LevelDB cache exceeds the cache threshold, we go through all queries by their last access time. Documents that are only associated with the last accessed queries are purged first. All documents in a query are purged as a group, which ensures that a re-run query either has all documents from the last run or none (unless another query has an overlapping result set). We delete up to 10% of documents per garbage collection run, but we never delete a document that is part of an active query.

Pending mutations count against the cache size, but are not deleted as this may cause data loss. They are removed once acknowledged by the backend.

The cache size is not a hard limit. The size may fluctuate between garbage collection runs and is likely to exceed the specified thresholds in between.

firebase / firebase-ios-sdk