firebase / firebase-android-sdk

Firebase Android SDK
https://firebase.google.com
Apache License 2.0
2.23k stars, 565 forks

Firestore query snapshot with changes only #5965

Closed bswhite1 closed 1 month ago

bswhite1 commented 1 month ago

What feature would you like to see?

The current QuerySnapshot returns the list of all documents that match the query every time a change occurs. For streams that return a large number of documents, this can be an issue, as well as inefficient. A changes-only snapshot would mimic how the Firebase backend talks to the cloud.

For instance, our use case requires us to keep a local cache of all documents returned by a query. When a single document is updated, QuerySnapshot will send a list of all documents matching the query along with a list of the documents that were changed. All we need is that list of changes to update our local cache.

Here is a FlutterFire PR with the requested changes: https://github.com/firebase/flutterfire/pull/11539 The reviewers requested that the change be made either in firebase-android-sdk or in the main Firebase project. I submitted a request to the main Firebase project, and they asked me to send it here.

This change is VERY important to us, since we are running a custom FlutterFire fork in order to get the functionality. I assume that every language could benefit from these changes.

Looking to create a 'QuerySnapshotChanges' or 'QueryChanges' based on: com.google.firebase.firestore.QuerySnapshot

How would you use it?

We listen only to changes, not the entire matching list of documents.

google-oss-bot commented 1 month ago

I couldn't figure out how to label this issue, so I've labeled it for a human to triage. Hang tight.

wu-hui commented 1 month ago

Note that the full document set is built from the SDK cache plus the changed docs sent by the backend. We are not transmitting the entire result set for every change that happens to your query.

You do end up with higher memory usage because of this, but we have to do it because we would otherwise not know where the changed documents rank in the order set by the query.
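To illustrate the ordering point, here is a minimal sketch in Java. It does not use the Firestore SDK and is not how the SDK is implemented; `Doc`, `OrderedResultSet`, and `upsert` are hypothetical names, and the single `score` sort field is an assumption. It only shows that working out where a changed document sits in the query order requires the rest of the ordered result set to be on hand.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a document: an id plus the field the query orders by.
final class Doc {
    final String id;
    final int score;

    Doc(String id, int score) {
        this.id = id;
        this.score = score;
    }
}

// Hypothetical model of a client-side result set kept in query order.
final class OrderedResultSet {
    // Order by score, then by id to break ties deterministically.
    private static final Comparator<Doc> ORDER =
        Comparator.comparingInt((Doc d) -> d.score).thenComparing(d -> d.id);

    private final List<Doc> docs = new ArrayList<>();

    // Inserts or updates a document and returns {oldIndex, newIndex}.
    // oldIndex is -1 when the document was not in the set before.
    int[] upsert(Doc changed) {
        int oldIndex = -1;
        for (int i = 0; i < docs.size(); i++) {
            if (docs.get(i).id.equals(changed.id)) {
                oldIndex = i;
                break;
            }
        }
        if (oldIndex >= 0) {
            docs.remove(oldIndex);
        }
        // The new rank can only be found by comparing against the other documents.
        int newIndex = 0;
        while (newIndex < docs.size() && ORDER.compare(docs.get(newIndex), changed) < 0) {
            newIndex++;
        }
        docs.add(newIndex, changed);
        return new int[] { oldIndex, newIndex };
    }
}
```

With only the changed document and no stored result set, neither index could be computed, which is the trade-off described above.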

bswhite1 commented 1 month ago

I am not sure why this feature request was closed. It would be a new feature, and comments could document its usage.

In my use case, I do not care about order, as I keep my own cache and sort the documents as needed.

If sorting were required, why exactly can't the order be determined? Doesn't the backend have to do that now when it gets a change from the cloud?

frozenfrank commented 1 week ago

I also would benefit from this feature. I know that Google Firestore does a lot of awesome caching for me automatically, but I have a scenario that requires me to do some of my own caching in addition to that provided out-of-the-box. Having access to the raw stream of new documents would allow me to invalidate my caches as the changed data arrives, rather than having to run deep-change detection on the entire set of new documents each time.
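For context, the "deep-change detection" being avoided might look roughly like the sketch below. It is plain Java with no Firestore dependency; `SnapshotDiff` and `changedIds` are hypothetical names, and documents are modelled as maps keyed by document id. The point is that every snapshot forces a comparison over the entire result set, even when only one document changed.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: compares two full result sets, keyed by document id.
final class SnapshotDiff {
    private SnapshotDiff() {}

    // Returns the ids of documents that were added, modified, or removed
    // between the previous and the current full snapshot.
    static Set<String> changedIds(Map<String, Map<String, Object>> previous,
                                  Map<String, Map<String, Object>> current) {
        Set<String> changed = new TreeSet<>();
        // Added or modified: Map.equals compares the field values deeply.
        for (Map.Entry<String, Map<String, Object>> entry : current.entrySet()) {
            Map<String, Object> before = previous.get(entry.getKey());
            if (before == null || !before.equals(entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        // Removed: present before, absent now.
        for (String id : previous.keySet()) {
            if (!current.containsKey(id)) {
                changed.add(id);
            }
        }
        return changed;
    }
}
```

The work grows with the size of the result set rather than with the number of changes, which is why a stream of changed documents is attractive for cache invalidation.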

Specifically, I am producing some changes locally and sending them to a CloudFunction for verification. In the meantime, I want to update my app with the new changes, but the Firestore cache won't handle that since it will quickly reject (before my CloudFunction finishes running) because the security rules failed.

If there was any way for me to access the raw set of fresh documents from the database before they are merged in with the cached documents, it would be an effective solution. Currently, it seems there is no way to do that.

(Here I'm discussing this in the context of the web client library.)

frozenfrank commented 1 week ago

@bswhite1 Are you concerned about getting the list of new changes, or something bigger than that?

I just noticed that the QuerySnapshot result has a field called docChanges() that reports only the documents that have changed since the last snapshot. See the specs for Flutter; the same field exists for JS in my case, and I assume for all other libraries.

Notably, this docChanges() is separate from the docs() method which returns all of the docs. If you are concerned about the memory requirements, it seems you can avoid calling the docs() method, and it won't generate the data. [The client library keeps a cache of all returned documents so it seems unlikely that you'll be able to avoid having all watched documents stored somewhere in your application.] If you are that concerned about memory usage, "watching" for changes probably isn't your best approach.
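As a rough sketch of the pattern described here, the Java below applies a list of changes to a local cache without ever touching the full document list. It is self-contained and does not call the Firestore SDK: `ChangeType`, `Change`, and `LocalCache` are hypothetical stand-ins. In the Android SDK the corresponding pieces would be `QuerySnapshot.getDocumentChanges()` and `DocumentChange.getType()`, whose values are `ADDED`, `MODIFIED`, and `REMOVED`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mirror of the three change kinds a snapshot reports.
enum ChangeType { ADDED, MODIFIED, REMOVED }

// Hypothetical stand-in for one reported change: the kind, the id, and the new data.
final class Change {
    final ChangeType type;
    final String id;
    final Map<String, Object> data;

    Change(ChangeType type, String id, Map<String, Object> data) {
        this.type = type;
        this.id = id;
        this.data = data;
    }
}

// Application-level cache that is updated from changes only.
final class LocalCache {
    private final Map<String, Map<String, Object>> byId = new HashMap<>();

    // Applies each change in order; the data of a REMOVED change is ignored.
    void apply(List<Change> changes) {
        for (Change change : changes) {
            switch (change.type) {
                case ADDED:
                case MODIFIED:
                    byId.put(change.id, change.data);
                    break;
                case REMOVED:
                    byId.remove(change.id);
                    break;
            }
        }
    }

    Map<String, Object> get(String id) {
        return byId.get(id);
    }

    int size() {
        return byId.size();
    }
}
```

In a real snapshot listener, the body of `apply` would iterate over the change list from each snapshot. Because the cache is keyed by id, the ordering of the full result set does not matter to it.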

What kind of scenario requires watching for changes to a large number of documents that also imposes strict memory requirements? Have you considered Firestore Cloud Triggers to directly respond to document changes?

frozenfrank commented 1 week ago

I now see why this issue is closed as completed.