Open thomasdao opened 7 months ago
Thanks for reporting @thomasdao. I'll try to reproduce it
@ehsannas thanks, I've invited you to the sample project :)
Thanks @thomasdao. I am able to see the error in the logs from your repo. I do, however, see that each such log message is followed by an UNAVAILABLE code from the backend, which means it's a legitimate error returned from the backend to the SDK. It's plausible that the newer WebChannel version has become much more efficient at sending parallel requests to the backend, such that you're hitting a request-rate limit for a single client. This error code is retryable with a backoff, which means the SDK will recover and rerun the query after some delay.
Please take a look at:
https://firebase.google.com/docs/firestore/real-time_queries_at_scale#understand_high_write_traffic_in_the_system
https://firebase.google.com/docs/firestore/best-practices#ramping_up_traffic
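The "retryable with a backoff" behavior mentioned above follows a common pattern. The sketch below is a generic illustration of that pattern, not the SDK's actual retry logic; the names `BackoffOptions`, `backoffDelayMs`, and `retryWithBackoff`, and all numeric values, are hypothetical.

```typescript
// Hypothetical helper types and names; not part of the Firebase SDK.
interface BackoffOptions {
  initialDelayMs: number; // wait before the first retry
  factor: number;         // multiplier applied after each failed attempt
  maxDelayMs: number;     // upper bound on any single wait
  maxAttempts: number;    // total attempts, including the first one
}

// Wait (in ms) before retry number `retry` (1-based), without jitter.
function backoffDelayMs(retry: number, opts: BackoffOptions): number {
  const raw = opts.initialDelayMs * Math.pow(opts.factor, retry - 1);
  return Math.min(raw, opts.maxDelayMs);
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Runs `op`, retrying with growing waits while `isRetryable` accepts the error.
async function retryWithBackoff<T>(
  op: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  opts: BackoffOptions
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      // Give up on non-retryable errors or when attempts are exhausted.
      if (attempt >= opts.maxAttempts || !isRetryable(err)) throw err;
      await sleep(backoffDelayMs(attempt, opts));
    }
  }
}
```

With `initialDelayMs: 1000` and `factor: 2`, the waits grow as 1 s, 2 s, 4 s, and are then capped at `maxDelayMs`. Production implementations usually add random jitter so that many clients do not retry in lockstep.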
@ehsannas I've never seen the UNAVAILABLE code, even if I wait for more than 10 minutes.
I find the reason "newer WebChannel version has become much more efficient at sending parallel requests" not really logical: the same type of query works with version 10.6.0, which indicates that the server is able to handle that query and the problem is likely with the newer version of the client.
I've tested adding a delay of 1 second between each paginated query to reduce server load, and I see the same error: @firebase/firestore: Firestore (10.7.0): WebChannelConnection RPC 'Listen' stream 0x269fb953 transport errored: Wn {type: 'c', target: Hn, g: Hn, defaultPrevented: false, status: 1}.
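For readers who want to repeat the "delay between paginated queries" experiment, here is a minimal sketch. It is not the reporter's code: `fetchPage` is a hypothetical callback standing in for whatever runs one paginated Firestore query, and `Page`, `fetchAllPages`, and the cursor type are illustrative names.

```typescript
// One page of results plus an opaque cursor for the next page (null when done).
interface Page<T, C> {
  items: T[];
  nextCursor: C | null;
}

// Fetches pages one at a time, pausing `delayMs` between consecutive requests.
// `sleep` is injectable so the pause can be replaced in tests.
async function fetchAllPages<T, C>(
  fetchPage: (cursor: C | null) => Promise<Page<T, C>>,
  delayMs: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T[]> {
  const all: T[] = [];
  let cursor: C | null = null;
  while (true) {
    const page: Page<T, C> = await fetchPage(cursor);
    all.push(...page.items);
    if (page.nextCursor === null) return all; // last page reached
    cursor = page.nextCursor;
    await sleep(delayMs); // throttle before requesting the next page
  }
}
```

Calling `fetchAllPages(fetchPage, 1000)` corresponds to the 1-second delay described above.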
I'm also running into this error. Subscription seems to work fine for a while and then gets dropped with the same RPC 'Listen' stream transport error. Any ideas on what this might be or where to catch the error?
Same issue after upgrading AngularFire to 17.0.1, which depends on firebase ^10.7.0.
Queries in one of our projects became slower and occasionally run into the @firebase/firestore: Firestore (10.7.2): WebChannelConnection RPC 'Listen' stream error. Another, smaller project works fine.
I tried experimentalForceLongPolling, mentioned in #7968, but no luck. Downgrading to 10.6.0 seems to resolve the issue.
I'm also seeing the same issue with hanging snapshot queries for a while, with the same type of WebChannelConnection RPC 'Listen' stream ... transport error.
Sometimes, after failing with the error, the snapshot query retries and returns correct data after a couple of minutes, but most times it just hangs indefinitely. In our case, it only happens with queries that would return a large amount of data (hundreds of docs containing fairly large strings).
The issues started with versions 10.4.x. They were then fixed in versions 10.6.x, but are now back again with 10.7.x. I've also tested the latest 10.8.0, and the issue is still there. As a summary:
10.3.1: issue not present
10.4.x: issue shows up
10.5.x: issue still present
10.6.x: issue fixed
10.7.x / 10.8.0: issue shows up again
Using experimentalForceLongPolling does not seem to make a difference.
I wasn't able to reproduce it in a local or staging environment, as it only seems to show up in our production environment, where we have around 40K snapshot listeners / 10K active connections, as reported in the Firebase console.
I'm also running into this error since upgrading to v10.7.0, and much like @phileasthefogg, getting the same RPC 'Listen' stream transport error. This is a small project (< 10 active connections at a time), and I'm able to reproduce it in both local and production environments.
Hi @ehsannas, not sure if you have been able to work on this issue? Maybe @MarkDuckworth can take a look. This issue has prevented us from updating to the latest version. Thank you!
The same issue happens in my project (using Flutter). In the beginning everything was fine (I've been using Firestore for about 6 months), but now I'm suddenly getting it all the time. Maybe the data sets have grown, and I didn't experience it before because the database was smaller.
@MrDavidRios Would you be able to share your project in which you're able to consistently reproduce this issue? (feel free to point me to a github repo). Thanks!
This phenomenon seems to be more likely to occur in a slow network environment. By setting "Fast 3G" or "Slow 3G" in Network of DevTools, we were able to reproduce the phenomenon even in an environment where it does not usually occur.
(note to googlers: this may be related to support case b/325591749, which reports similar webchannel issues when the network is throttled)
Same thing happens in our project. Unfortunately I can't downgrade to firebase 10.6.0 (without much effort) because of AngularFire and Angular dependencies. It still happens on firebase 10.9.0 ...
This issue has been happening since December last year and affects multiple projects, but it has not received any update. I'm on the Blaze plan but cannot update the library to the latest version, and it's really frustrating. Could you please share whether any of you are investigating this issue? Thank you! @MarkDuckworth @dconeybe @ehsannas
@thomasdao, I'll touch base with the team and see if I can move this forward.
This problem affects users in our production apps. We are also in the middle of developing a new app and can consistently reproduce the error. It seems to be connected to the size of Firestore documents. Our documents are max. 300,000 bytes, which is far below the limit specified on the official Firestore documentation page (1 MiB / 1,048,576 bytes) and we are fetching max. 40 documents in a single query.
We would highly appreciate if the Firebase team could check what changed in recent versions and fix it soon.
Thank you @MarkDuckworth.
Just to second @thomasdao & @jorgsiegel, this has long been part of the stable releases and affects our users. For various reasons we are unable to downgrade. We have a long-lived GCP ticket open regarding this. I have a feeling this happens more often the bigger the result set is. We run an SPA where we stream about 5000 documents, all in the region of 1KB. When the queries fail they restart over and over, resulting in the client downloading 100MB for what should be 5MB. We have no workaround for this.
Would really appreciate to see some progress here.
We're also encountering this issue (running 10.8)
Tried 10.11 and it's still happening, but as suggested above downgrading to 10.6 fixed it
I have a potential fix for this issue. Would anyone be willing/able to test it out? The fix is in https://github.com/firebase/firebase-js-sdk/pull/8145 (NOTE: it is still a work-in-progress). Please comment on the PR with the outcome of your experiment (rather than commenting here on the issue).
You will need to build the Firestore SDK yourself, but, thankfully, it's relatively straightforward.
npm install -g yarn
git clone --depth 100 https://github.com/firebase/firebase-js-sdk.git
(if using an existing clone of this repo, make sure you're at a commit that includes #8145)
cd firebase-js-sdk
yarn
yarn build
cd packages/firestore
yarn build:debug
cp -r dist ~/YOUR_PROJECT/node_modules/@firebase/firestore
Note that the --depth 100 argument to git is just an optimization to pull about 8MB instead of 30MB. Feel free to omit that argument.
Note that the extra yarn build:debug command is optional, and produces Firestore's index.esm2017.js with all of the code mangling, code stripping, and optimizations disabled. This will produce more readable compiled code and stack traces without mangled names that are much easier to make sense of.
The "cp" command will copy the compiled Firestore JavaScript bundles into your own project's node_modules directory, clobbering the ones that npm downloaded. Make sure to restore the production version (e.g. by deleting the node_modules directory and re-running npm install) when done testing out this fix.
@thomasdao, I have a branch (markduckworth/debug-webchannel-stat-events) that will log additional events from WebChannel. This logging is showing some useful additional info before a WebChannelConnection transport error on my device.
Can you test with this branch on your local reproduction and provide me with any log statements for "STAT_EVENT"? If these events appear before the WebChannelConnection RPC 'Listen' stream 0x269fb953 transport errored event, please include those log lines too.
Your help is greatly appreciated.
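For anyone collecting the requested logs, a small filter can pull out the relevant lines. This is only a sketch: it assumes that the log is available as an array of text lines, that the debug branch's lines contain the literal text "STAT_EVENT", and that the error lines contain "transport errored" as in the message quoted above. `ErrorContext` and `statEventsBeforeErrors` are hypothetical names.

```typescript
// A transport error line plus the STAT_EVENT lines logged before it.
interface ErrorContext {
  errorLine: string;
  statEvents: string[];
}

// Groups STAT_EVENT lines with the next "transport errored" line that follows them.
// STAT_EVENT lines after the last error are ignored.
function statEventsBeforeErrors(logLines: string[]): ErrorContext[] {
  const contexts: ErrorContext[] = [];
  let pending: string[] = [];
  for (const line of logLines) {
    if (line.includes("STAT_EVENT")) {
      pending.push(line);
    } else if (line.includes("transport errored")) {
      contexts.push({ errorLine: line, statEvents: pending });
      pending = []; // start collecting for the next error
    }
  }
  return contexts;
}
```

Each returned entry pairs one error with the STAT_EVENT lines seen since the previous error (or since the start of the log).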
@MarkDuckworth I checked out your branch and followed the instructions from https://github.com/firebase/firebase-js-sdk/issues/7860#issuecomment-2052471034. Please see the attached log, thanks!
Thanks @thomasdao.
In my local tests, when I see WebChannelConnection RPC 'Listen' stream X transport errored: ..., the STAT_EVENT logging shows that the root cause was expected/normal. Furthermore, I saw the SDK recover gracefully.
In your logs, the STAT_EVENTs leading up to the WebChannelConnection error are different. I'm trying to understand why. The repro that you previously shared with me is not currently reproducing this error. Does that shared repo still reproduce the issue for you?
Also @thomasdao, can you provide the Firebase project ID you used when creating firebase_log.txt? Is it the same project ID from your shared repro? We want to review server logs.
@MarkDuckworth
"The repro that you previously shared with me is not currently reproducing this error. Does that shared repo still reproduce the issue for you?"
Yes, I can still reproduce this issue. Sometimes the query completes, but the next time I run it, the query hangs.
"Is it the same project ID from your shared repro?"
Yes, it's the same project ID.
Version 10.11.1 was released today and rolls back the WebChannel config to be equivalent to the 10.6 (and 10.5.2) releases. I have tested with @thomasdao's reproduction and I'm seeing the queries complete consistently and quickly. The error WebChannelConnection RPC 'Listen' stream 0x269fb953 transport errored: Wn {type: 'c', target: Hn, g: Hn, defaultPrevented: false, status: 1} was not observed.
@MarkDuckworth thank you, I tried 10.11.1 and found the query can complete quickly.
Just curious, is WebChannel really superior to the FetchXmlHttpFactory? What's the problem with FetchXmlHttpFactory?
Friends, this has already been fixed by the Firebase team in the newest version, 10.11.1 (April 25, 2024). From the release notes:
Cloud Firestore: Prevent spurious "Backend didn't respond within 10 seconds" errors when network is in fact responding, but slowly. See GitHub PR #8145. https://firebase.google.com/support/release-notes/js
Operating System
Both Mac and Windows
Browser Version
Chrome, Electron Browser window
Firebase SDK Version
10.7.0, 10.7.1
Firebase SDK Product:
Firestore
Describe your project's tooling
Plain Electron app
Describe the problem
This is the new ticket for the hanging query issue, a follow-up to https://github.com/firebase/firebase-js-sdk/pull/7771 and https://github.com/firebase/firebase-js-sdk/issues/7652
After updating Firebase to 10.7.0 or 10.7.1, the query becomes a lot slower and frequently gets stuck with the error below:
After switching back to 10.6.0, the query completes quickly.
Steps and code to reproduce issue
I've created a minimal sample to reproduce this issue and have shared it with @MarkDuckworth. If you need access to the private repo, please let me know. Thank you!