Open abarone-btf opened 3 months ago
Hi @abarone-btf. Thank you for your question. While I look into this, could you try different permutations of your query to see which clause/clauses seem to be the culprit?
Here are some permutations to try, and report back which ones result in the UNAVAILABLE
error and which ones succeed:
Thank you.
Hi @dconeybe !, of course, this are the results for each of the steps (I am applying this individually, starting each remove from the base query described above, not a permutation of removes):
UNAVAILABLE
error ❌UNAVAILABLE
error ❌createdAt
condition (keeping the OR of course): Needed a new index, after creation the result is UNAVAILABLE
error ❌One thing I tried that "works" is applying an order by deletedAt
first (asc
), then by createdAt
, the query runs smoothly without problems (maybe because the data has many null
values in that field, in contrast createdAt
has a lot more cardinality), also in that example the plan summary uses a single index, in contrast with the base query that uses 2 "almost identical" indexes.
In my case that re orderding does not work because I might miss some entries in the pagination flow, but its worth mention it.
Thanks again for your help.
Would you be willing to provide your Firebase project number so we can check the backend logs?
You can find your project number in the Firebase Console (screenshot)
Of course, this is our project number 783579267042
Thank you @abarone-btf. The backend engineers have investigated and found a potential query planner bug in the backend (Googlers see b/346321260). I'll keep you posted on developments as investigation continues.
One question though... you mentioned in the OP that
the query is searching over an Id that its repeated on my database with aprox 27k items
When you say "aprox 27k items" is "27k" referring to the number of documents that match the query, or the total number of documents in the collection? If the former, can you give an estimate on the total number of documents in the collection?
Awesome @dconeybe!!, yes the 27k elements means that there are aprox 27k documents with myId = 123
(for the record, when this number is "low" the query ends without problem)
Do you know, roughly, how many documents are in the entire collection (i.e. with any value fo myId
)?
Yes!, in the collection there are aprox 200k elements
One question, what does it mean "Googlers see b/346321260"?
@abarone-btf Could you please open a support ticket at https://firebase.google.com/support/troubleshooter/contact for this issue? We need an official support ticket to provide justification to look at your project in more detail. Be sure to include the URL of this github issue in your support ticket and also mention that b/346321260 as the internal ticket that has been logged for this issue.
One more request: Could you try removing the Filter.where('deletedAt', '==', null)
clause and see if the query runs to completion?
"Googlers see b/346321260" is a note for Google engineers, mentioning the internal ticket that is tracking this issue, b/346321260.
Awesome @dconeybe !, I created the ticket (screenshot below).
In relation to the test, without Filter.where('deletedAt', '==', null)
the query returns sucessfully and without Filter.where('deletedAt', '>=', new Date()),
(leaving the == null) it also returns sucessfully.
Thanks again!
@abarone-btf could you let us know the case number (if any)? Also, if you are comfortable, it would help if you can provide the un-redacted query in the support case.
While we wait for the support ticket to be created and surfaced, can you also help us populate the table below by testing on your end (also correct me if I record the behavior incorrectly)?
query | behavior |
---|---|
where('myId', '==', 123).where('createdAt', '<=', new Date()).where(or(where('deletedAt', '>=', new Date()), where('deletedAt', '==', null))).orderBy('createdAt', DESC).limit(500) | timeout |
where('myId', '==', 123).where(or(where('deletedAt', '>=', new Date()), where('deletedAt', '==', null))).orderBy('createdAt', DESC).limit(500) | timeout |
where('myId', '==', 123).where('createdAt', '<=', new Date()).where('deletedAt', '>=', new Date()).orderBy('createdAt', DESC).limit(500) | succeed |
where('myId', '==', 123).where('createdAt', '<=', new Date()).where('deletedAt', '==', NULL).orderBy('createdAt', DESC).limit(500) | succeed |
where('myId', '==', 123).where('createdAt', '<=', new Date()).where('deletedAt', '==', NULL).orderBy('createdAt', DESC).orderBy('deletedAt', DESC).limit(500) | ? |
where('myId', '==', 123).where(or(where('deletedAt', '>=', new Date()), where('deletedAt', '==', null))).orderBy('deletedAt', DESC).limit(500) | ? |
Hi @xwkuang5 !, of course the ticket numer is 10287347
, about the un-redacted query, how do I send that?, as a reply on the mail?.
Related to the table, both at the end works good. Also there is one missing on the base query: Without orderBy: Needed a new index, after creation the result is UNAVAILABLE error ❌
.
Hi @abarone-btf , could you please help us confirm that by running the following queries, if the number of results returned on your end matches below? These are the numbers we observed from our logs. | Query | Num of results |
---|---|---|
where('myId', '==', 123).where('createdAt', '<=', new Date()).where('deletedAt', '>=', new Date()).orderBy('createdAt', DESC).limit(500) | 0 | |
where('myId', '==', 123).where('createdAt', '<=', new Date()).where('deletedAt', '==', NULL).orderBy('createdAt', DESC).limit(500) | 500 |
Hi!, @clairekeer1997 . Yes, thats exactly the result!
@abarone-btf, could you enable debug logging and re-run the problematic query from your opening comment? The debug logs will help the investigation. If you don't feel comfortable sharing the logs publicly here in the github issue, feel free to add them to the support ticket, where only Google will have visibility. If you do feel comfortable sharing them publicly, please create a gist at https://gist.github.com/ and share the link to the gist here.
To enable debug logging, call
setLogFunction(console.log);
https://googleapis.dev/nodejs/firestore/latest/global.html#setLogFunction
Hi @dconeybe of course!, but how do I add them to the support ticket?
@dconeybe I read that replying to the mail is the way to go!. I reply to it with the gist url 👌 .
If you can please confirm that you receive it that would be awesome.
Thanks!
We have received the debug logs and have been able to reproduce the issue. This is great news because it means we can investigate and test out fixes. We will update on the progress by commenting on this GitHub issue.
Awesome @dconeybe !!. Is there some light at the end of the tunnel? 👀 jajaja.
Anything I can help you with, please tell me.
Thanks!
Yes, a fix has been submitted and we need to roll it out to the backend servers. The ETA for the fix to actually be fully rolled out is 2-4 weeks. Thanks again for all your help diagnosing this!
Update: Unfortunately, rollouts have not gone as smoothly as hoped. It will likely be another 2-4 weeks (some time in August 2024) when the fix will finally roll out.
dconeybe: I appreciate your support for this issue and consistently following up.
Is the issue in this thread caused by the same bug that causes the issue mentioned here: https://stackoverflow.com/questions/78605344/how-to-fix-grpc-channel-multithreadedrendezvous-fails-statuscode-unavailable ?
Is the issue in this thread caused by the same bug that causes the issue mentioned here: https://stackoverflow.com/questions/78605344/how-to-fix-grpc-channel-multithreadedrendezvous-fails-statuscode-unavailable ?
I don't think that stackoverflow post is the same issue. The fix for this issue is related to how "or" queries are planned and executed, but that stackoverflow post doesn't seem to be performing an "or" query. I'd suggest logging an issue against https://github.com/googleapis/python-firestore/ since that looks like a python-specific bug (the team that maintains nodejs-firestore is different than the team that maintains python-firestore). It would be interesting, though, if using the CollectionReference.stream()
method of nodejs-firestore works in this specific case that is problematic with python. If you can reproduce the same issue in node then feel free to open a new issue against this github repository for it.
@abarone-btf the fix has been rolled out for some time now, can you confirm if the issue has been fixed for you? Thank you!
Thanks for stopping by to let us know something could be better!
PLEASE READ: If you have a support contract with Google, please create an issue in the support console instead of filing on GitHub. This will ensure a timely response.
1) Is this a client library issue or a product issue? Yes its the client library
2) Did someone already solve this?
3) Do you have a support contract? No
Environment details
@google-cloud/firestore
version: 7.6.0Steps to reproduce
This is an example of the query:
14 UNAVAILABLE: Query timed out. Please try either limiting the entities scanned, or run with an updated index configuration.
.What may be happening here?, maybe am missing some weird index?, I tried to run the explain function but with the
analyze
flag the result is the same, without it, the plan summary its this one:Thanks in advance!