googleapis / google-cloud-php

Google Cloud Client Library for PHP
https://cloud.google.com/php/docs/reference
Apache License 2.0

URGENT: [composer] GRPC randomly throws ServiceException("Socket closed") #2427

Closed: lukasgit closed this issue 2 years ago

lukasgit commented 4 years ago

@dwsupplee @jdpedrie this issue still randomly persists.

URGENT REQUEST. We're part of the Google Cloud Startup program and launching this year... a fix would be greatly appreciated so we can move to production.

Log output:

PHP Fatal error:  Uncaught Google\Cloud\Core\Exception\ServiceException: {
    "message": "Socket closed",
    "code": 14,
    "status": "UNAVAILABLE",
    "details": []
} in /[...]/Google/composer-google-cloud/vendor/google/cloud/Core/src/GrpcRequestWrapper.php:257
Stack trace:
#0 /[...]/Google/composer-google-cloud/vendor/google/cloud/Core/src/GrpcRequestWrapper.php(194): Google\Cloud\Core\GrpcRequestWrapper->convertToGoogleException(Object(Google\ApiCore\ApiException))
#1 [internal function]: Google\Cloud\Core\GrpcRequestWrapper->handleStream(Object(Google\ApiCore\ServerStream))
#2 /[...]/Google/composer-google-cloud/vendor/google/cloud/Firestore/src/SnapshotTrait.php(122): Generator->current()
#3 /[...]/Google/composer-google-clo in /[...]/Google/composer-google-cloud/vendor/google/cloud/Core/src/GrpcRequestWrapper.php on line 257
lukasgit commented 4 years ago

@dwsupplee firestore, not spanner

jdpedrie commented 4 years ago

@lukasgit what version of the grpc extension do you have installed? You can find this value by running pecl list if you installed via PECL, or in the output of php -i or phpinfo().
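
Or, from within PHP itself (a minimal sketch; phpversion() returns false when an extension is not loaded):

// Print the loaded grpc and protobuf extension versions, if any.
echo 'grpc: ', phpversion('grpc') ?: 'not loaded', PHP_EOL;
echo 'protobuf: ', phpversion('protobuf') ?: 'not loaded', PHP_EOL;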

lukasgit commented 4 years ago

@jdpedrie

Package  Version State
grpc     1.23.1  stable
protobuf 3.10.0  stable
jdpedrie commented 4 years ago

Thanks @lukasgit.

Does it seem to always be related to a particular service call, or does it happen randomly?

Would you be able to enable gRPC debugging and share the results relating to a failure? You can set the following environment variables to turn on debugging: GRPC_VERBOSITY=debug GRPC_TRACE=all. Please note that the debug output may include sensitive information such as access tokens, so if you'd prefer, you may send it to me via the email address on my profile.

lukasgit commented 4 years ago

@jdpedrie randomly, and I'm not seeing any gRPC debugging info in the php logs:

putenv('GRPC_VERBOSITY=debug');
putenv('GRPC_TRACE=all');
jdpedrie commented 4 years ago

@lukasgit I'm sorry I was unclear. Setting those variables will cause additional logging data to be written to stderr, not to the PHP log.

lukasgit commented 4 years ago

@jdpedrie not seeing any gRPC debugging info to stderr.

php.ini display_errors = stderr

lukasgit commented 4 years ago

@jdpedrie ^^

jdpedrie commented 4 years ago

Hey @lukasgit, sorry for not responding sooner. I'm working on this, though, and hope to have more by the end of the day, or tomorrow at the latest.

lukasgit commented 4 years ago

@jdpedrie great, thanks for the update.

jdpedrie commented 4 years ago

Are you using PHP-FPM? Make sure your configuration has catch_workers_output=yes. This should cause the workers' stderr output to be written to the server error log.

lukasgit commented 4 years ago

@jdpedrie still nothing... here is what I have configured:

nginx-1.17.5 (dev environment compiled with --with-debug):

error_log logs/error.log debug;

php-fpm (php-7.3.11):

catch_workers_output = yes
env[GRPC_VERBOSITY] = debug
env[GRPC_TRACE] = all

php code:

putenv('GRPC_VERBOSITY=debug');
putenv('GRPC_TRACE=all');
jdpedrie commented 4 years ago

Hi @lukasgit, I'm sorry again for the back-and-forth. I've been trying without much success to capture the gRPC debugging data from nginx/php-fpm.

Do you have control over the system that manages the FPM daemon (supervisord or systemd, for instance)? Are you able to configure php-fpm to start with an explicit redirect of stderr to a file?

php-fpm 2>/var/log/php-fpm.log

Additionally, I spoke with a contact on the gRPC team, and he suggested you set the verbosity and trace variables a bit differently from what I advised earlier:

GRPC_VERBOSITY=debug                                                                                    
GRPC_TRACE=api,call_error,channel,client_channel_call,connectivity_state,handshaker,http,subchannel,tcp

I've opened an issue on gRPC to improve the utilities for capturing gRPC debugging information in PHP.

lukasgit commented 4 years ago

Hi @jdpedrie, no worries on the back-and-forth. Whatever it takes for us to resolve this issue.

I do have full control over the development system. Following your instructions, there is still nothing related to gRPC in /var/log/php-fpm.log:

[08-Nov-2019 21:24:18] NOTICE: fpm is running, pid 1104
[08-Nov-2019 21:24:18] NOTICE: ready to handle connections

nginx-1.17.5 (dev environment compiled with --with-debug):

error_log logs/error.log debug;

php-fpm (php-7.3.11):

catch_workers_output = yes
env[GRPC_VERBOSITY] = debug
env[GRPC_TRACE] = api,call_error,channel,client_channel_call,connectivity_state,handshaker,http,subchannel,tcp

php code:

putenv('GRPC_VERBOSITY=debug');
putenv('GRPC_TRACE=api,call_error,channel,client_channel_call,connectivity_state,handshaker,http,subchannel,tcp');
jdpedrie commented 4 years ago

Hi @lukasgit,

@stanley-cheung, the person I've been talking to on the gRPC team, mentioned that in his tests using Apache, he found that setting the environment variables with putenv, and even in the server configuration, was too late to take effect. In his test using Docker, he set them in the Dockerfile and had better luck. Could you try setting them at the highest level possible? Perhaps /etc/environment or similar.
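
A minimal sketch for a Debian-style /etc/environment (assumptions: your distribution reads this file for all processes, and php-fpm is restarted afterwards so the workers inherit the variables; if php-fpm runs under systemd, the unit may need Environment= or EnvironmentFile= entries instead):

GRPC_VERBOSITY=debug
GRPC_TRACE=api,call_error,channel,client_channel_call,connectivity_state,handshaker,http,subchannel,tcp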

stanley-cheung commented 4 years ago

Hi @lukasgit I am one of the maintainers of the grpc php extension.

The only thing that seems to work for me (with the php-fpm + nginx setup) is the combination of these 2:

I have tried all those catch_workers_output = yes, env[GRPC_VERBOSITY] = debug settings in the fpm www.conf file, but none of them work. Doing putenv() in a .php script is definitely too late.

But even with the php-fpm 2>mylog trick, the log output is rather ugly, with "[pool www] child 12 said into stdout: " prefixes, and lines get broken up.

So I am currently working on adding a php.ini option to the grpc extension so that we can divert all those grpc logs into a separate log file, tracked in here. Will keep you updated.

But also, just so I know: for the initial error, how often does it happen? Are we talking about 1 in every 10 requests, or 1 in every 10,000 requests? I just want to get a sense of the scale.

lukasgit commented 4 years ago

Hi @stanley-cheung thanks for jumping in on this issue.

I will hang tight until the php.ini option for the grpc extension is available.

As for how many times the error occurs per x requests, it varies widely. Sometimes it happens 7 times out of 50 requests. Sometimes 0 times out of 1,000 requests. We haven't run a stress test of 10,000 requests on grpc yet.

stanley-cheung commented 4 years ago

I started this PR: https://github.com/grpc/grpc/pull/20991. This works for the most part but may need some more polish. I am slightly concerned about the lack of file rotation / capping of the log file size - as it stands, the log file will grow unbounded. A separate cron job or similar might be needed to monitor and regularly truncate the log file.

lukasgit commented 4 years ago

Hi @stanley-cheung any update on this issue? Thanks :)

stanley-cheung commented 4 years ago

@lukasgit Sorry for the late reply. We are very close to cutting a 1.26.0RC1 release candidate, which will contain that fix. Once that's done, will you be able to install that in your environment and enable the grpc log to a separate log file? You might need to monitor the file size growth if the error was not happening frequently.

lukasgit commented 4 years ago

@stanley-cheung since our last chat, we upgraded the composer version and only had the error happen one time (yesterday). So yeah, it's still very random. I can definitely run your next RC, just provide me with instructions when you're ready. Thanks!

stanley-cheung commented 4 years ago

@lukasgit 1.26.0RC1 should be available on PECL now. If you just do [sudo] pecl install grpc-beta, that should be the version installed.

Please add the following lines to your php.ini file:

grpc.grpc_verbosity=debug
grpc.grpc_trace=all,-timer_check
grpc.log_filename=/var/log/grpc.log

You can ignore additional tracers by adding more entries prefixed with a minus sign ('-') to the grpc.grpc_trace option. See this doc for all the possible values for that option. There are potentially a few you can exclude to trim down the volume of the log.
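
For example, to silence another tracer in addition to timer_check (a sketch; timer is assumed here to be a valid tracer name, check the doc above for the full list):

grpc.grpc_trace=all,-timer_check,-timer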

digarahayu commented 4 years ago

I get the same issue locally, but production works fine with the same version.

stakable commented 4 years ago

Have a similar issue with our local env. Every 4th request to Firestore from PHP fails with "unavailable" and an empty message. Code 14.

Test code

$ref = $firestore->collection('users');
$doc = $ref->document('someuser');
$snap = $doc->snapshot();
return $snap->exists();

PHP 7.2.10
grpc extension v1.26.0
google/protobuf ^v3.3.0

jdpedrie commented 4 years ago

@digarahayu @stakable could you try enabling gRPC tracing as @stanley-cheung described above and sharing some logs from the error cases?

ulver2812 commented 4 years ago

I think this issue is related to mine #2539. Fatal error: Uncaught Google\Cloud\Core\Exception\ServiceException: { "message": "Empty update", "code": 14, "status": "UNAVAILABLE", "details": [] }

adammeyer commented 4 years ago

I'm seeing this about twice a day in production. Hundreds of other times it works fine. Using App Engine standard, PHP 7.3.

edi commented 4 years ago

Same here. I recently started getting quite a few of these ... like once every 10 minutes.

PHP Fatal error:  Uncaught Google\Cloud\Core\Exception\ServiceException: {
    "message": "Socket closed",
    "code": 14,
    "status": "UNAVAILABLE",
    "details": []
} in /lib/firestore/vendor/google/cloud-core/src/GrpcRequestWrapper.php:257

PHP 7.4.4 (Debian 10 x64) Nginx 1.10.3

It's very annoying to get these in production. I understand we could retry these requests with exponential backoff, but that isn't always an option: some of these requests process card authorizations, and we cannot store those credentials in order to re-run them at a later point.

Please advise.
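
For what it's worth, for read-only calls a small retry wrapper along these lines can at least mask the UNAVAILABLE errors while the root cause is investigated (a rough sketch, not a fix; it assumes the exception code carries the gRPC status 14 shown in the trace above, and retryUnavailable and $doc are hypothetical names):

use Google\Cloud\Core\Exception\ServiceException;

// Hypothetical helper: retry an idempotent callable on gRPC code 14 (UNAVAILABLE)
// with exponential backoff. Assumes getCode() carries the gRPC status code,
// as the "code": 14 in the trace above suggests.
function retryUnavailable(callable $call, int $maxAttempts = 3)
{
    $attempt = 0;
    while (true) {
        try {
            return $call();
        } catch (ServiceException $e) {
            $attempt++;
            if ($e->getCode() !== 14 || $attempt >= $maxAttempts) {
                throw $e;
            }
            usleep((int) (200000 * (2 ** ($attempt - 1)))); // 0.2s, 0.4s, 0.8s, ...
        }
    }
}

// Usage with a read, where $doc is a Firestore DocumentReference:
$snap = retryUnavailable(function () use ($doc) {
    return $doc->snapshot();
});

It obviously doesn't help with the non-idempotent card-authorization requests mentioned above.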

edi commented 4 years ago

An update, as I can't seem to edit the previous message. This never happened on a server from another provider (Hetzner).

On the other hand, on the servers at DigitalOcean it happens quite often, sometimes multiple times per hour, sometimes once per day, with random methods that don't necessarily have big payloads.

So, could it be some php-fpm config issue? A child process killing the socket? I'm not sure how it works, so I'm just throwing ideas around.

jdpedrie commented 4 years ago

@edi are you able to try capturing some of the tracing data as described by @stanley-cheung above?

lukasgit commented 4 years ago

@jdpedrie @stanley-cheung we've noticed the issue persists but something of interest might be happening.

The issue seems to suddenly stop when we update the composer package google/cloud.

Is there possibly a correlation between these random "Socket closed" errors due to changes on your end that might not be compatible with a prior version of google/cloud composer package?

jdpedrie commented 4 years ago

Do you know what the previous version of google/cloud was where the issue appeared? We'll have to look at what has changed between then and now. But do I understand correctly that the latest version of the client seems to no longer have the issue? If so, that's great news!

edi commented 4 years ago

On the server with the issue, I have "google/cloud-core": "^1.31", which is very weird because I installed it myself via composer require about two weeks ago.

Regarding grpc, I have 1.27.0 installed, and I'm now upgrading to 1.28.1; hopefully that sorts it out.

This is how many times it happened for me so far, in the past 48 hours.

I've also re-installed cloud-core at 1.36.1 (given the OP's success with it) and will be back tomorrow with an update.

Regards.

jdpedrie commented 4 years ago

Thanks @edi, please do let us know.

edi commented 4 years ago

Well gents, so far so good after updating to cloud-core@1.36.1 👍

As a side note, the server which stayed on cloud-core@1.31 and only got a gRPC upgrade still has the issues.

Another server, running the same gRPC@1.27.0 the problem server originally had, has cloud-core@1.35 installed. That one didn't have this kind of issue either.

So, the issue is obviously not with gRPC (1.27 or 1.28.1), but with the older version of cloud-core.

1st server -> cloud-core 1.31, gRPC 1.28.1 -> BAD
2nd server -> cloud-core 1.35, gRPC 1.27.0 -> OK
3rd server -> cloud-core 1.36.1, gRPC 1.28.1 -> OK

edi commented 4 years ago

I may have spoken too soon. It still happens on cloud-core@1.36.1 and gRPC@1.28.1. Times when it happened today:

Will try to implement the logging mentioned above and come back with more info... it's so weird, as the exact same setup on another server at another provider has never had this issue.

lukasgit commented 4 years ago

@jdpedrie so far no errors in the last 5 days:

Using version ^0.131.0 for google/cloud
./composer.json has been created
Loading composer repositories with package information
Updating dependencies (including require-dev)
Package operations: 22 installs, 0 updates, 0 removals
  - Installing google/crc32 (v0.1.0): Loading from cache
  - Installing psr/cache (1.0.1): Loading from cache
  - Installing psr/http-message (1.0.1): Loading from cache
  - Installing ralouphie/getallheaders (3.0.3): Loading from cache
  - Installing guzzlehttp/psr7 (1.6.1): Loading from cache
  - Installing guzzlehttp/promises (v1.3.1): Loading from cache
  - Installing guzzlehttp/guzzle (6.5.2): Loading from cache
  - Installing firebase/php-jwt (v5.2.0): Downloading (100%)         
  - Installing google/auth (v1.8.0): Downloading (100%)         
  - Installing google/protobuf (v3.11.4): Downloading (100%)         
  - Installing google/common-protos (1.2): Downloading (100%)         
  - Installing grpc/grpc (1.27.0): Downloading (100%)         
  - Installing google/grpc-gcp (0.1.4): Loading from cache
  - Installing google/gax (1.3.0): Downloading (100%)         
  - Installing symfony/polyfill-ctype (v1.15.0): Downloading (100%)         
  - Installing ramsey/collection (1.0.1): Downloading (100%)         
  - Installing brick/math (0.8.15): Downloading (100%)         
  - Installing ramsey/uuid (4.0.1): Downloading (100%)         
  - Installing psr/log (1.1.3): Downloading (100%)         
  - Installing monolog/monolog (2.0.2): Loading from cache
  - Installing rize/uri-template (0.3.2): Loading from cache
  - Installing google/cloud (v0.131.0): Downloading (100%)         
edi commented 4 years ago

I have decided to move to a new Hetzner server. Just set up a new droplet on DO with a basic nginx/php setup (Debian 10 x64), nothing special.

Then you'll start randomly getting those socket closed errors, even with the latest binaries and SDKs.

Back on Hetzner for 3 days now, no errors whatsoever.

lukasgit commented 4 years ago

Back on Hetzner for 3 days now, no error whatsoever.

@edi other than your setup being on a different hosting provider, was there a change in your server configuration or code?

lukasgit commented 4 years ago

@jdpedrie @stanley-cheung This is not acceptable for a mission-critical datastore. The random "status": "UNAVAILABLE" errors with Firestore are preventing us from going live.

Since April 22, everything has been working just fine. We thought perhaps you unknowingly figured out the problem. I left this issue open as a precaution.

As of June 6, every day we are once again receiving these random "status": "UNAVAILABLE" errors.

We upgraded the composer package today. In the upgrade we can see grpc-gcp changed from 0.1.4 to 0.1.5.

Latest upgrade on June 10 (today):

Previously installed package (since April 22):

stanley-cheung commented 4 years ago

@lukasgit Just want to clarify - these errors didn't happen between ~Apr 16-ish and Jun 6-ish?

lukasgit commented 4 years ago

@stanley-cheung Confirmed - these errors did not happen between ~Apr 16-ish and Jun 6-ish

stanley-cheung commented 4 years ago

2 suggestions:

  1. Would you by any chance be able to pin these package versions back to the ones from April 22 that you listed above? That way we can try to see if it's because of those packages. Because, conceivably, even if you pin those versions back, these UNAVAILABLE errors may still happen - and in that case it's likely not because of those packages.
  2. Are you able to capture the grpc logs as per my comment here? https://github.com/googleapis/google-cloud-php/issues/2427#issuecomment-562334183. If we can see the grpc logs for those calls - that will help.

And just as a guess - there are 3 main areas these errors could come from (and the fact that these errors did not happen between Apr 16 and Jun 6 didn't help much to narrow it down):

  1. some code in the client libraries changed and is responsible for it. I personally think this is unlikely; client-side changes are unlikely to cause a server to return UNAVAILABLE.
  2. some network issues. This is the hardest to debug. This highly depends on your setup. Where are you running the clients? What are the hops between that and the Firestore servers? Are you going across regions? Anything changed in one of those hops could have caused the issues.
  3. Something changed on the Firestore side or the Google API serving side. Even that has many layers and possibilities.

In any case, we need some more concrete things before we can really look into this:

lukasgit commented 4 years ago
  1. Would you by any chance be able to pin these package versions back to the ones from April 22 that you listed above? That way we can try to see if it's because of those packages. Because, conceivably, even if you pin those versions back, these UNAVAILABLE errors may still happen - and in that case it's likely not because of those packages.

Between April 6 and June 10, we were using version ^0.131.0 for google/cloud which contained google/grpc-gcp (0.1.4)

As of today, June 10, we upgraded to version ^0.133.1 for google/cloud, which contains google/grpc-gcp (0.1.5)

  2. Are you able to capture the grpc logs as per my comment here? #2427 (comment). If we can see the grpc logs for those calls - that will help.

I will look into this again.

And just as a guess - there are 3 main areas these errors could come from (and the fact that these errors did not happen between Apr 16 and Jun 6 didn't help much to narrow it down):

  1. some code in the client libraries changed and is responsible for it. I personally think this is unlikely; client-side changes are unlikely to cause a server to return UNAVAILABLE.

If no errors come up after today's upgrade to the latest version ^0.133.1 of google/cloud, then the probability of this being a client library issue must be considered.

  2. some network issues. This is the hardest to debug. This highly depends on your setup. Where are you running the clients? What are the hops between that and the Firestore servers? Are you going across regions? Anything changed in one of those hops could have caused the issues.

The client is running on a GCE Debian instance located in zone us-east1-b. The Firestore database location is nam5 (us-central).

  3. Something changed on the Firestore side or the Google API serving side. Even that has many layers and possibilities.

This would imply Firestore is not production ready if multiple connection attempts are randomly denied.

In any case, we need some more concrete things before we can really look into this:

  • Is there a reproducible test case? For example, could you give us a Docker image, have us plug in our own credentials, and run it for a few days to see whether we can reproduce the errors ourselves?

As I and many others have stated in this thread, the issue is completely random. I don't believe it's a problem on our end if we're all experiencing the same issue. I filed this urgent issue back in October 2019 and we are still dealing with it as of June 2020. Furthermore, I would presume there are many others in production who aren't reporting it because their users simply retry when an error occurs. That is no excuse for a mission-critical datastore.

  • Or, as mentioned above, if you can capture some logs on your side, that will help.
$ cat php_error.log 
[06-Jun-2020 15:22:26 UTC] PHP Fatal error:  Uncaught Google\Cloud\Core\Exception\ServiceException: {
    "message": "Socket closed",
    "code": 14,
    "status": "UNAVAILABLE",
    "details": []
} in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php:257
Stack trace:
#0 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php(194): Google\Cloud\Core\GrpcRequestWrapper->convertToGoogleException(Object(Google\ApiCore\ApiException))
#1 [internal function]: Google\Cloud\Core\GrpcRequestWrapper->handleStream(Object(Google\ApiCore\ServerStream))
#2 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Firestore/src/SnapshotTrait.php(122): Generator->current()
#3 /opt/Google/composer-google-cloud/composer-google-clo in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php on line 257
[09-Jun-2020 02:22:13 UTC] PHP Fatal error:  Uncaught Google\Cloud\Core\Exception\ServiceException: {
    "message": "Socket closed",
    "code": 14,
    "status": "UNAVAILABLE",
    "details": []
} in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php:257
Stack trace:
#0 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php(194): Google\Cloud\Core\GrpcRequestWrapper->convertToGoogleException(Object(Google\ApiCore\ApiException))
#1 [internal function]: Google\Cloud\Core\GrpcRequestWrapper->handleStream(Object(Google\ApiCore\ServerStream))
#2 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Firestore/src/SnapshotTrait.php(122): Generator->current()
#3 /opt/Google/composer-google-cloud/composer-google-clo in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php on line 257
[09-Jun-2020 21:45:28 UTC] PHP Fatal error:  Uncaught Google\Cloud\Core\Exception\ServiceException: {
    "message": "Socket closed",
    "code": 14,
    "status": "UNAVAILABLE",
    "details": []
} in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php:257
Stack trace:
#0 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php(194): Google\Cloud\Core\GrpcRequestWrapper->convertToGoogleException(Object(Google\ApiCore\ApiException))
#1 [internal function]: Google\Cloud\Core\GrpcRequestWrapper->handleStream(Object(Google\ApiCore\ServerStream))
#2 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Firestore/src/SnapshotTrait.php(122): Generator->current()
#3 /opt/Google/composer-google-cloud/composer-google-clo in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php on line 257
[09-Jun-2020 21:50:42 UTC] PHP Fatal error:  Uncaught Google\Cloud\Core\Exception\ServiceException: {
    "message": "Socket closed",
    "code": 14,
    "status": "UNAVAILABLE",
    "details": []
} in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php:257
Stack trace:
#0 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php(194): Google\Cloud\Core\GrpcRequestWrapper->convertToGoogleException(Object(Google\ApiCore\ApiException))
#1 [internal function]: Google\Cloud\Core\GrpcRequestWrapper->handleStream(Object(Google\ApiCore\ServerStream))
#2 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Firestore/src/SnapshotTrait.php(122): Generator->current()
#3 /opt/Google/composer-google-cloud/composer-google-clo in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php on line 257
[10-Jun-2020 16:00:07 UTC] PHP Fatal error:  Uncaught Google\Cloud\Core\Exception\ServiceException: {
    "message": "Socket closed",
    "code": 14,
    "status": "UNAVAILABLE",
    "details": []
} in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php:257
Stack trace:
#0 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php(194): Google\Cloud\Core\GrpcRequestWrapper->convertToGoogleException(Object(Google\ApiCore\ApiException))
#1 [internal function]: Google\Cloud\Core\GrpcRequestWrapper->handleStream(Object(Google\ApiCore\ServerStream))
#2 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Firestore/src/SnapshotTrait.php(122): Generator->current()
#3 /opt/Google/composer-google-cloud/composer-google-clo in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php on line 257
[10-Jun-2020 17:26:26 UTC] PHP Fatal error:  Uncaught Google\Cloud\Core\Exception\ServiceException: {
    "message": "Socket closed",
    "code": 14,
    "status": "UNAVAILABLE",
    "details": []
} in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php:257
Stack trace:
#0 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php(194): Google\Cloud\Core\GrpcRequestWrapper->convertToGoogleException(Object(Google\ApiCore\ApiException))
#1 [internal function]: Google\Cloud\Core\GrpcRequestWrapper->handleStream(Object(Google\ApiCore\ServerStream))
#2 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Firestore/src/SnapshotTrait.php(122): Generator->current()
#3 /opt/Google/composer-google-cloud/composer-google-clo in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php on line 257
[10-Jun-2020 17:38:45 UTC] PHP Fatal error:  Uncaught Google\Cloud\Core\Exception\ServiceException: {
    "message": "Socket closed",
    "code": 14,
    "status": "UNAVAILABLE",
    "details": []
} in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php:257
Stack trace:
#0 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php(194): Google\Cloud\Core\GrpcRequestWrapper->convertToGoogleException(Object(Google\ApiCore\ApiException))
#1 [internal function]: Google\Cloud\Core\GrpcRequestWrapper->handleStream(Object(Google\ApiCore\ServerStream))
#2 /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Firestore/src/SnapshotTrait.php(122): Generator->current()
#3 /opt/Google/composer-google-cloud/composer-google-clo in /opt/Google/composer-google-cloud/composer-google-cloud-20200415/vendor/google/cloud/Core/src/GrpcRequestWrapper.php on line 257
stanley-cheung commented 4 years ago

These are just the PHP level logs. The comment I had here https://github.com/googleapis/google-cloud-php/issues/2427#issuecomment-562334183 will enable logs from the grpc C extension, which can capture more of the underlying networking layers.

By the way, can you confirm the version of the grpc extension being used? Please run

php --re grpc | head -1

lukasgit commented 4 years ago

Totally forgot about the extensions.... here's what is currently installed. We will update today.

Do you believe this is the underlying issue?

$ php --re grpc | head -1
Extension [ <persistent> extension #35 grpc version 1.23.1 ] {
$ php --re protobuf | head -1
Extension [ <persistent> extension #36 protobuf version 3.10.0 ] {
stanley-cheung commented 4 years ago

Could be. Version 1.23.1 is about 10 months old at this point. We are currently at 1.29.1, and 1.30.0 should be coming out in a week or so, so you can try upgrading.

For protobuf, I'd recommend either 3.11.4 or 3.12.2.
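
With a PECL-based install, that upgrade is typically something like the following sketch (then restart php-fpm or the web server so the new extension versions are loaded):

[sudo] pecl upgrade grpc
[sudo] pecl upgrade protobuf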

stanley-cheung commented 4 years ago

Actually, also, which version of Debian are you running?

joshuaoliver commented 4 years ago

This is an issue we get totally randomly as well and have been unable to trace. Following this in case the rest of you manage to find a fix.

https://sentry.io/share/issue/5ede539ebefb4e18a09cf9415c9a7b9b/

sl0wik commented 4 years ago

I've been fighting with this issue for months. I'm running my apps on App Engine standard (PHP 7.3), and this issue has basically forced us to remove Firestore from various services.