googleapis / gax-ruby

Google API Extensions for Ruby
https://rubygems.org/gems/google-gax
BSD 3-Clause "New" or "Revised" License

GaxError Exception occurred in retry method #40

Closed quartzmo closed 8 years ago

quartzmo commented 8 years ago

Hi,

While using Gax in google-cloud-language development, both my local testing and our Travis CI build sometimes get the following error. Retrying the call usually succeeds.

The backtrace pasted below is from Job #1433, which should contain all details about the environment.

Thank you!

Google::Gax::RetryError: GaxError Exception occurred in retry method that was not classified as transient, caused by 14:{"created":"@1472169812.007530609","description":"Secure read failed","file":"src/core/lib/security/transport/secure_endpoint.c","file_line":157,"grpc_status":14,"referenced_errors":[{"created":"@1472169812.007501620","description":"EOF","file":"src/core/lib/iomgr/tcp_posix.c","file_line":235}]}
    /home/travis/.rvm/gems/ruby-2.2.5/gems/google-gax-0.4.4/lib/google/gax/api_callable.rb:352:in `rescue in block (2 levels) in retryable'
    /home/travis/.rvm/gems/ruby-2.2.5/gems/google-gax-0.4.4/lib/google/gax/api_callable.rb:346:in `block (2 levels) in retryable'
    /home/travis/.rvm/gems/ruby-2.2.5/gems/google-gax-0.4.4/lib/google/gax/api_callable.rb:345:in `loop'
    /home/travis/.rvm/gems/ruby-2.2.5/gems/google-gax-0.4.4/lib/google/gax/api_callable.rb:345:in `block in retryable'
    /home/travis/.rvm/gems/ruby-2.2.5/gems/google-gax-0.4.4/lib/google/gax/api_callable.rb:264:in `call'
    /home/travis/.rvm/gems/ruby-2.2.5/gems/google-gax-0.4.4/lib/google/gax/api_callable.rb:264:in `block in catch_errors'
    /home/travis/.rvm/gems/ruby-2.2.5/gems/google-gax-0.4.4/lib/google/gax/api_callable.rb:226:in `call'
    /home/travis/.rvm/gems/ruby-2.2.5/gems/google-gax-0.4.4/lib/google/gax/api_callable.rb:226:in `block in create_api_call'
    /home/travis/.rvm/gems/ruby-2.2.5/gems/google-gax-0.4.4/lib/google/gax/api_callable.rb:252:in `call'
    /home/travis/.rvm/gems/ruby-2.2.5/gems/google-gax-0.4.4/lib/google/gax/api_callable.rb:252:in `block in create_api_call'
    /home/travis/build/GoogleCloudPlatform/google-cloud-ruby/google-cloud-language/lib/google/cloud/language/v1beta1/language_service_api.rb:198:in `call'
    /home/travis/build/GoogleCloudPlatform/google-cloud-ruby/google-cloud-language/lib/google/cloud/language/v1beta1/language_service_api.rb:198:in `annotate_text'
    /home/travis/build/GoogleCloudPlatform/google-cloud-ruby/google-cloud-language/lib/google/cloud/language/service.rb:80:in `block in annotate'
    /home/travis/build/GoogleCloudPlatform/google-cloud-ruby/google-cloud-language/lib/google/cloud/language/service.rb:105:in `execute'
    /home/travis/build/GoogleCloudPlatform/google-cloud-ruby/google-cloud-language/lib/google/cloud/language/service.rb:80:in `annotate'
    /home/travis/build/GoogleCloudPlatform/google-cloud-ruby/google-cloud-language/lib/google/cloud/language/project.rb:176:in `annotate'
    /home/travis/build/GoogleCloudPlatform/google-cloud-ruby/google-cloud-language/acceptance/language/text_test.rb:60:in `block (3 levels) in <top (required)>'

blowmage commented 8 years ago

Are errors like this to be expected? Do clients need to do anything to handle these errors?

jmuk commented 8 years ago

Hmm. It looks like a gRPC error occurred there (and GAX thinks it's not a retryable error, so it raises an exception).
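
To make that classification concrete, here is a minimal, self-contained sketch of the idea. It is not GAX's actual implementation. `StatusError`, `RETRYABLE_CODES`, and `call_with_retry` are hypothetical names invented for this illustration. The only facts borrowed from gRPC are that status code 14 is UNAVAILABLE and 4 is DEADLINE_EXCEEDED. An error is retried only when its code is in the configured set; anything else is re-raised to the caller.

```ruby
# Hypothetical error type carrying a gRPC-style numeric status code.
class StatusError < StandardError
  attr_reader :code

  def initialize(code, message = "status #{code}")
    super(message)
    @code = code
  end
end

# Assumed configuration: which status codes count as transient.
# In gRPC, 4 is DEADLINE_EXCEEDED and 14 is UNAVAILABLE.
RETRYABLE_CODES = [4, 14].freeze

# Calls the block, retrying with exponential backoff while the error's
# status code is in retryable_codes. Any other error, or a retryable one
# once max_attempts is reached, is re-raised to the caller.
def call_with_retry(retryable_codes: RETRYABLE_CODES, max_attempts: 3, base_delay: 0.1)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StatusError => e
    raise unless retryable_codes.include?(e.code) && attempts < max_attempts

    sleep(base_delay * (2**(attempts - 1)))
    retry
  end
end
```

Under these assumptions, a block that fails with code 14 is retried up to `max_attempts` times, while calling with `retryable_codes: []` lets the first failure propagate immediately, which is roughly the situation the error above describes.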

I've never seen this error in my own local testing, so I'm not sure why it happens for you or on Travis. The "description":"Secure read failed" part looks odd: it suggests that some exceptional network failure occurred.
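
For anyone who wants to inspect these details programmatically, the following is a rough sketch that pulls the status code and the nested descriptions out of a message shaped like the one in the report. The helper names `parse_grpc_details` and `innermost_description` are made up for this example, and it assumes the message always ends with `<code>:<json>` as it does above; that layout is not a documented contract.

```ruby
require "json"

# Splits a message ending in "caused by <code>:<json>" into the numeric
# status code and the parsed JSON details. Returns nil if it does not match.
def parse_grpc_details(message)
  match = message.match(/caused by (\d+):(\{.*\})\s*\z/m)
  return nil unless match

  [Integer(match[1]), JSON.parse(match[2])]
end

# Follows "referenced_errors" down to the innermost error and returns
# that error's description.
def innermost_description(details)
  current = details
  while (nested = current["referenced_errors"]) && !nested.empty?
    current = nested.first
  end
  current["description"]
end

# The message text is copied from the report above.
message = 'GaxError Exception occurred in retry method that was not ' \
          'classified as transient, caused by ' \
          '14:{"created":"@1472169812.007530609",' \
          '"description":"Secure read failed",' \
          '"file":"src/core/lib/security/transport/secure_endpoint.c",' \
          '"file_line":157,"grpc_status":14,"referenced_errors":[' \
          '{"created":"@1472169812.007501620","description":"EOF",' \
          '"file":"src/core/lib/iomgr/tcp_posix.c","file_line":235}]}'

code, details = parse_grpc_details(message)
puts code                            # 14
puts details["description"]          # Secure read failed
puts innermost_description(details)  # EOF
```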

blowmage commented 8 years ago

@quartzmo has had this error locally, but I've never seen it.

jmuk commented 8 years ago

I looked into the message a bit more. The root cause of the failure seems to be "description":"EOF","file":"src/core/lib/iomgr/tcp_posix.c","file_line":235, which appears at the end of the first line of the error. That happens when recvmsg (in C) returns 0.

According to http://pubs.opengroup.org/onlinepubs/009695399/functions/recvmsg.html,

If no messages are available to be received and the peer has performed an orderly shutdown, recvmsg() shall return 0.

I am not sure why the connection was closed by the peer (i.e. the Google service), but I think it is network trouble on the machine.
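
The orderly-shutdown behavior quoted above can be observed with a small local experiment. This is only a sketch using a loopback TCP socket from Ruby's standard library, not the gRPC code path, and `read_until_eof` is a hypothetical helper. After the peer writes and closes its end, the reader gets the pending data and then an end-of-stream result, which is what gRPC reports as "EOF".

```ruby
require "socket"

# Reads from sock until the peer's orderly shutdown is observed.
# Depending on the Ruby version, recv signals end-of-stream with an
# empty string or with nil, so both are treated as EOF here.
def read_until_eof(sock)
  received = +""
  loop do
    chunk = sock.recv(1024)
    break if chunk.nil? || chunk.empty?

    received << chunk
  end
  received
end

server = TCPServer.new("127.0.0.1", 0) # port 0: let the OS pick a free port
client = TCPSocket.new("127.0.0.1", server.addr[1])
peer = server.accept

peer.write("hello")
peer.close # orderly shutdown by the peer

data = read_until_eof(client)
puts data # hello

client.close
server.close
```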

quartzmo commented 8 years ago

Network trouble on the Google service? Or the client?

jmuk commented 8 years ago

I'm thinking of client-side trouble. In the case of Travis, I believe the network connection from the Travis instance is managed/supervised by Travis itself, and they could forcibly close the connection for some reason (because it takes too long, for example).

jmuk commented 8 years ago

Ugh, I've heard that there was some trouble on Google's service side very recently, and that might have caused this issue. They fixed the code, so please double-check whether it still reproduces.

quartzmo commented 8 years ago

I can no longer reproduce, and I haven't seen it on Travis CI today either. Thanks for your help!

hxiong388 commented 8 years ago

I'm seeing this error in both error_reporting and logging when running locally and on GAE. The error isn't consistent, and it doesn't seem to prevent my gRPC requests from going through successfully.