firebase / firebase-android-sdk

Firebase Android SDK
https://firebase.google.com
Apache License 2.0
2.26k stars 573 forks source link

[Firestore] INTERNAL ASSERTION FAILED: Encoded field value should not be null. #106

Closed natario1 closed 5 years ago

natario1 commented 5 years ago

Environment

Problem

Occasionally, when creating a new object, I get this:

    java.lang.RuntimeException: Internal error in Firestore (0.6.6-dev).
        at com.google.firebase.firestore.obfuscated.zzgf.zzb(com.google.firebase:firebase-firestore@@17.1.1:377)
        at com.google.firebase.firestore.obfuscated.zzgk.run(Unknown Source:2)
        at android.os.Handler.handleCallback(Handler.java:790)
        at android.os.Handler.dispatchMessage(Handler.java:99)
        at android.os.Looper.loop(Looper.java:164)
        at android.app.ActivityThread.main(ActivityThread.java:6494)
        at java.lang.reflect.Method.invoke(Native Method)
        at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:438)
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:807)
     Caused by: java.lang.AssertionError: INTERNAL ASSERTION FAILED: Encoded field value should not be null.
        at com.google.firebase.firestore.obfuscated.zzge.zza(com.google.firebase:firebase-firestore@@17.1.1:3046)
        at com.google.firebase.firestore.obfuscated.zzge.zza(com.google.firebase:firebase-firestore@@17.1.1:3031)
        at com.google.firebase.firestore.obfuscated.zzfu.zza(com.google.firebase:firebase-firestore@@17.1.1:262)
        at com.google.firebase.firestore.obfuscated.zzfu.zza(com.google.firebase:firebase-firestore@@17.1.1:957)
        at com.google.firebase.firestore.obfuscated.zzfu.zzb(com.google.firebase:firebase-firestore@@17.1.1:727)
        at com.google.firebase.firestore.obfuscated.zzbf.zza(com.google.firebase:firebase-firestore@@17.1.1:166)
        at com.google.firebase.firestore.obfuscated.zzcy.zzc(com.google.firebase:firebase-firestore@@17.1.1:103)
        at com.google.firebase.firestore.obfuscated.zzcy.zza(com.google.firebase:firebase-firestore@@17.1.1:142)
        at com.google.firebase.firestore.obfuscated.zzbg.zza(com.google.firebase:firebase-firestore@@17.1.1:524)
        at com.google.firebase.firestore.obfuscated.zzbo.run(Unknown Source:6)
        at com.google.firebase.firestore.obfuscated.zzcw.zza(com.google.firebase:firebase-firestore@@17.1.1:150)
        at com.google.firebase.firestore.obfuscated.zzbg.zza(com.google.firebase:firebase-firestore@@17.1.1:514)
        at com.google.firebase.firestore.obfuscated.zzak.zza(com.google.firebase:firebase-firestore@@17.1.1:173)
        at com.google.firebase.firestore.obfuscated.zzk.zza(com.google.firebase:firebase-firestore@@17.1.1:91)
        at com.google.firebase.firestore.obfuscated.zzm.zzb(com.google.firebase:firebase-firestore@@17.1.1:137)
        at com.google.firebase.firestore.obfuscated.zzw.run(Unknown Source:4)
        at com.google.firebase.firestore.obfuscated.zzgf.zzc(com.google.firebase:firebase-firestore@@17.1.1:309)
        at com.google.firebase.firestore.obfuscated.zzgj.call(Unknown Source:2)
        at com.google.firebase.firestore.obfuscated.zzgf.zza(com.google.firebase:firebase-firestore@@17.1.1:285)
        at com.google.firebase.firestore.obfuscated.zzgi.run(Unknown Source:4)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:457)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:301)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)
        at com.google.firebase.firestore.obfuscated.zzgf$zza.run(com.google.firebase:firebase-firestore@@17.1.1:203)

Due to the codebase it is very hard for me to create a small reproducible sample. What I am doing is creating a new document, while I have a real time query listener. The document is created, so I assume the app crashes when the query listener is notified about it.

I will investigate more, but I could be more helpful if the owners could point me to the right direction. What is this Encoded field value? What could be the issue?

rsgowman commented 5 years ago

Firestore encodes/serializes data when either sending it over the network to the backend servers, or when sending it to the local disk. The field values themselves are usually just the document contents, as set by (eg) DocumentReference.set().

So, naively, this could be a Document with data that has somehow become corrupt and failing to serialize before being sent out to the network/disk.

But looking at the stack trace (which, unfortunately, is obfuscated), I can see that this is actually occurring while attempting to encode/serialize the query itself to the local disk. In particular, it looks to be the query's 'endAt/endBefore' field that is causing the exception.

So take a look through your code, and see if you call endAt or endBefore anywhere. If so, take a look at the parameter(s) to those functions and see what the values are. My suspicion, is that the parameters to these functions are somehow being encoded improperly... but I'm not immediately able to come up with something that would cause this. If you can come up with a minimal repro of this problem, I can investigate further.

If you do figure out what's causing the problem, please let us know! Even if you're somehow passing invalid data, we should be able to catch that earlier and issue a better error message than this.

natario1 commented 5 years ago

Thanks! I will investigate and get back to you

natario1 commented 5 years ago

@rsgowman I can tell you that the only endAt / endBefore call is with a non-null DocumentSnapshot (the object that was just created).

I have not debugged, but my guess is that:

If I remember correctly, the second query will immediately receive a snapshot whose createdAt is null (because we didn't get the server response yet). And I don't know why, but this might be causing the crash.

If you have suggestions let me know.

natario1 commented 5 years ago

So in my specific setup this happens when the collection is initially empty. I create the document with serverTimestamp. The first query listener is notified a of a new list (size = 1). It takes the last element and creates a new query where this last element is used as endAt (this way the query #2 will catch all new values).

Won't bother you with respect to why I do this, but if my understanding is correct, the crash is caused by something like: db.collection("collectionWithSize1").endAt(snapshotWithNullCreatedAt).orderBy("createdAt", DESCENDING).

Edit: you can reproduce with these lines:

        val STORE = FirebaseFirestore.getInstance()
        val COLLECTION = "test"
        val FIELD = "testField"
        STORE.collection(COLLECTION).addSnapshotListener { snapshot, _ ->
            snapshot?.documents?.lastOrNull()?.let {
                STORE.collection(COLLECTION).orderBy(FIELD).endAt(it).addSnapshotListener { _, _ ->
                    // Won't get here. Crashes before.
                }
            }
        }
        val document = mapOf(FIELD to FieldValue.serverTimestamp())
        STORE.collection(COLLECTION).document().set(document)
rsgowman commented 5 years ago

I have not debugged, but my guess is that:

  • I create a document, setting the field createdAt to FieldValue.serverTimestamp()
  • A real time query listener is notified about it. It takes this snapshot and uses it as the endAt for a second query
  • This second query is ordered by the field createdAt

If I remember correctly, the second query will immediately receive a snapshot whose createdAt is null (because we didn't get the server response yet). And I don't know why, but this might be causing the crash.

Yes, this is exactly the problem. Your reproduction illustrates this nicely, and I've got this working (i.e. failing) on this end now. The problem is that we're attempting to orderBy a field with a timestamp that hasn't been resolved by the server just yet. We should be sorting documents in this state last rather than crashing. I'll see if I can get a fix together for this.

rsgowman commented 5 years ago

This is slightly more subtle than just fixing locally. In addition to properly ordering locally, we also need to send something to the backend so it can properly process the query and return all the results. We could send the current local time, but this could be incorrect, either by a little or a lot, so would result in an incorrect query. We could send 'infinity' (since the doc was just created, the timestamp should be the largest value) but this runs into trouble if there are future values in the database.

Basically, this is a bit of an edge case, with no really great solution from our end. We can certainly make a better error message, but it's not entirely clear what else (if anything) we should be doing.

Would you be able to provide us more context? If you can tell us exactly what you're trying to do, we might be able to come up with something better.

Additionally, if you're looking for a solution now, you could wait until the document's timestamp has been resolved by the server before creating that second listener. Checkout https://firebase.google.com/docs/reference/android/com/google/firebase/firestore/QuerySnapshot.html#getMetadata() and https://firebase.google.com/docs/reference/android/com/google/firebase/firestore/SnapshotMetadata.html#hasPendingWrites(). If you only create the listener when hasPendingWrites is false, I think you'd avoid this case altogether.

natario1 commented 5 years ago

Yes, I have fixed this on my side by waiting for the next call.

The context is simply paginating data while keeping real time listeners for each page. The technique of using startAfter + limit is not safe for insertions (you lose data between pages) or deletions (duplicated data between pages).

What's safe is startAfter + endAt, but I can't know beforehand the time-range of the data in order to have pages of reasonable size (plus if a page comes empty, we can't say that the next will be empty as well...).

So what I am doing is register a first listener with startAfter + limit. When it gets valid results, it takes the last item and replaces itself with a new startAfter + endAt listener, & passes the last item to the next page so it can do the same.

This way each page safely observes the same amount of data. The drawback is that each page requires two network calls. If you have better ideas, please let me know 👍

It would be really nice if the backend could support something like query.endAtLimit(15). The behavior of this would be, use the 15th item as an endAt() reference (so, don't constrain this query to have 15 results).

wilhuff commented 5 years ago

We've discussed adding something very similar, solving the same problem. It's on our backlog, but unfortunately we have a ton other things to do in the run up to making Firestore GA.

mikelehen commented 5 years ago

As wilhuff mentions, the pagination scenario is on our radar. Sorry for the pain. :-(

Just to make sure we understand the steps that led to the crash, did you have to create your new document (with the server timestamp in it) and then immediately (before the write completes) attempt to scroll to the next page (to trigger the second query using the unresolved server timestamp in endAt())? So this manifests as a timing issue of sorts? (clicking "next page" immediately after "add doc" crashes, but if you wait a second, then it works)

natario1 commented 5 years ago

Not really @mikelehen , with my setup (which is the only working setup I could think of), the crash happens on the first page, intially empty: after creation, the query finally gets some results, so it replaces itself (limit=X) with another query (endAt=documents.last). But the last and only document is the one with pending writes.

If somehow this succeeded, the second page would be opened using this document as startAfter and crash too, assuming the field is still null.

I could further improve the whole thing by avoid launching the second query if documents.size < X, but overall, yes, this is more complex than it should be.

mikelehen commented 5 years ago

Ahh... I see (mostly, anyway)! I was having trouble figuring out why writing a document would trigger a new query, but it sounds like you have a bit of a special case to deal with "empty results" and writing a document is what gets you out of that condition and triggers the next pagination query. Thanks for explaining.

natario1 commented 5 years ago

See, it is complex! Thank you all, appreciate how you handle issues here.

rsgowman commented 5 years ago

FWIW, https://github.com/firebase/firebase-android-sdk/pull/138 "fixes" this such that an exception is thrown much earlier, and is more descriptive of the problem. I'll close this issue for now; feel free to re-open if there's something additional you feel should happen here.